Series note — This article is part of MAF v1: Python and .NET. The Python-only predecessor Part 3 — Building Domain-Specific Tools covers production patterns against a real database. This chapter is the canonical “one tool, no dependencies” introduction — once the decorator shape clicks, every tool in the capstone is more of the same.
Repo — Runnable code for this chapter: tutorials/02-add-tools. Clone, `cd` in, follow along.
Why this chapter#
A tool is what turns a chat-only agent into something that can actually do things — look up data, place an order, send an email. The agent in Chapter 01 could only answer from the LLM’s training data. Give it one function and the behaviour shifts: now the LLM decides whether a real-world lookup is needed, and MAF handles the round trip.
We’ll add exactly one tool — a weather lookup backed by a hard-coded dictionary. Boring data on purpose. The mechanics are the whole point.
Prerequisites#
- Completed Chapter 01 — Your First Agent.
- `.env` at the repo root with either `OPENAI_API_KEY` or the Azure OpenAI trio.
- Read-first (optional): Agents — Tools and Journey — Adding Tools.
The concept#
A MAF tool is a function the agent can call, defined in user code and executed by the framework. It has three parts:
- A function — regular Python or C#, sync or async, nothing special.
- A name and description — what the LLM sees when deciding which tool to call.
- Typed parameters with descriptions — MAF converts these into a JSON schema the LLM fills in.
The critical thing to internalise: the LLM does not execute your function. It cannot. It only emits structured tokens that name the tool and its arguments — a short JSON-ish fragment embedded in its response. MAF parses those tokens, looks up the function in the tools list you passed to the agent, calls it with the parsed arguments, and feeds the return value back into the conversation for the next LLM turn. The function runs inside your process, under your permissions. That gap — between “the LLM asked” and “the framework executed” — is the fundamental safety boundary every agent relies on.
This cycle is the tool-calling loop: call LLM → if it returned a tool call, invoke the function → append the result to the conversation → call LLM again → repeat until the LLM produces a normal text answer with no further tool calls. MAF runs the loop for you. You write the function; the framework wires the rest.
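If you wrote the loop out by hand, it would look roughly like this. This is a minimal sketch, not MAF's internals: `call_llm` is a hypothetical stand-in for a single chat-completions request, and the dict dispatch mirrors the `tools=[...]` list you pass the agent.

```python
# A hypothetical sketch of the loop MAF runs for you (not the framework's
# real internals). call_llm() stands in for one chat-completions request;
# call.arguments is assumed already parsed from JSON into a dict.
TOOLS = {"get_weather": get_weather}  # your tools=[...] list, keyed by name

def run_loop(messages: list[dict]) -> str:
    while True:
        reply = call_llm(messages, tools=TOOLS)          # 1. call the LLM
        if not reply.tool_calls:                         # 4. plain text: done
            return reply.content
        for call in reply.tool_calls:                    # 2. LLM *asked* for a tool;
            result = TOOLS[call.name](**call.arguments)  #    your process executes it
            messages.append(                             # 3. result rejoins the
                {"role": "tool", "content": result})     #    conversation; loop again
```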
(Diagram: the tool-calling loop. The LLM never executes the function; it asks the framework to, then sees the result in its next context window. The agent is the coordinator, the tool runs in your process, and only the model connection actually talks to the LLM.)
What the LLM actually sees#
Both SDKs serialise your tool into an OpenAI-compatible JSON schema before the first turn. For our get_weather example, the schema that ships to the model looks roughly like this:
{
"name": "get_weather",
"description": "Look up the current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city to look up, e.g. 'Paris'."
}
},
"required": ["city"]
}
}

You don’t write that schema. MAF builds it from your Python type hints plus `Field(...)` descriptions, or from your C# parameters plus `[Description]` attributes. The LLM reads it to decide whether to call and how to format the arguments. Every sentence you put in a description is a sentence the LLM uses for ranking.
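The return half of the contract is just as small. When the LLM decides to call the tool, its reply carries a structured fragment shaped roughly like this (OpenAI-style wire format; field names vary slightly by provider):

```json
{
  "tool_calls": [{
    "id": "call_abc123",
    "type": "function",
    "function": {
      "name": "get_weather",
      "arguments": "{\"city\": \"Paris\"}"
    }
  }]
}
```

Note that `arguments` arrives as a JSON string: these are the structured tokens from earlier, which MAF parses and validates against the schema before your function ever runs.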
Jargon recap#
- Tool — a function (or hosted capability like code interpreter) the agent can call. Defined in user code; executed by the framework; results fed back to the LLM.
- `@tool` decorator (Python) — marks a function as a tool. MAF builds the JSON schema from `Annotated[type, Field(description=...)]` annotations.
- `[Description]` attribute (.NET) — decorates a method and its parameters so `AIFunctionFactory.Create()` can build the JSON schema MAF advertises to the LLM.
- `AIFunctionFactory.Create` (.NET) — turns a regular method into an `AIFunction` the agent can accept in its `tools:` list.
- Tool-calling loop — the cycle where the LLM emits a structured token that names a tool and its arguments; the framework parses it, invokes the function, feeds the result back into the conversation, and asks the LLM for a final response. MAF runs this loop for you.
- `Annotated[...]` + `Field` (Python) — the Pydantic-style pair that attaches a description to a parameter so MAF can include it in the JSON schema.
- JSON schema — the contract the LLM uses to format its tool call. You never write it by hand; MAF derives it from your type hints and descriptions.
Full definitions in the jargon glossary.
Python#
Full source: python/main.py. Key lines:
# python/main.py (excerpt)
from typing import Annotated
from agent_framework import Agent, tool
from pydantic import Field
INSTRUCTIONS = (
"You are a helpful assistant. "
"When the user asks about the weather in a city, call the `get_weather` tool. "
"For other questions, answer directly in one short sentence."
)
@tool(name="get_weather", description="Look up the current weather for a city.")
def get_weather(
city: Annotated[str, Field(description="The city to look up, e.g. 'Paris'.")],
) -> str:
canned = {
"paris": "Sunny, 18°C, light breeze.",
"london": "Overcast, 12°C, light drizzle.",
"canberra": "Partly cloudy, 21°C.",
"tokyo": "Rain, 15°C.",
}
return canned.get(city.lower(), f"No weather data for {city}.")
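# _default_client(), called below, builds the OpenAI or Azure OpenAI chat
# client from .env; it lives in the full main.py and is omitted from this excerpt.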
def build_agent(client=None) -> Agent:
return Agent(
client or _default_client(),
instructions=INSTRUCTIONS,
name="weather-agent",
tools=[get_weather],
)

Three things worth staring at:
- The instructions explicitly mention the tool name. The LLM is not psychic; making the tool discoverable in the system prompt raises the hit rate dramatically.
- Each parameter is `Annotated[type, Field(description=...)]`. The type hint drives the JSON schema; the `Field` description is what the LLM reads to decide whether to call. Treat descriptions as product copy for the model.
- `tools=[get_weather]` passes the decorated function directly. No registry, no config file. MAF inspects the object to pull out the schema.
Run it:
cd tutorials/02-add-tools/python
uv sync
uv run python main.py "What's the weather in Paris?"
# Q: What's the weather in Paris?
# A: The weather in Paris is sunny, 18°C, with a light breeze.

Ask the same agent something unrelated — “What is 2 + 2?” — and MAF bypasses the tool entirely. The LLM only calls it when the description matches the question.
.NET#
Full source: dotnet/Program.cs. Key lines:
// dotnet/Program.cs (excerpt)
using System.ComponentModel;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
public const string Instructions =
"You are a helpful assistant. "
+ "When the user asks about the weather in a city, call the get_weather tool. "
+ "For other questions, answer directly in one short sentence.";
[Description("Look up the current weather for a city.")]
public static string GetWeather(
[Description("The city to look up, e.g. 'Paris'.")] string city)
{
var canned = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["Paris"] = "Sunny, 18°C, light breeze.",
["London"] = "Overcast, 12°C, light drizzle.",
["Canberra"] = "Partly cloudy, 21°C.",
["Tokyo"] = "Rain, 15°C.",
};
return canned.TryGetValue(city, out var forecast) ? forecast : $"No weather data for {city}.";
}
public static AIAgent BuildAgent()
{
var chatClient = /* same factory as Ch01 */;
var tools = new AITool[] { AIFunctionFactory.Create(GetWeather) };
return chatClient.AsAIAgent(
instructions: Instructions,
name: "weather-agent",
tools: tools);
}

The two `[Description]` attributes (on the method and on each parameter) serve the same purpose the Python `Field(description=...)` calls do: they feed the JSON schema the LLM sees. `AIFunctionFactory.Create(GetWeather)` reads those attributes, inspects the method signature, and returns an `AIFunction` that MAF can advertise to the model.
Run it:
cd tutorials/02-add-tools/dotnet
dotnet run -- "What's the weather in Paris?"
# Q: What's the weather in Paris?
# A: The weather in Paris is sunny, 18°C, with a light breeze.

Structured outputs — when you want a typed result, not prose#
A tool returns whatever its function returns, and the LLM then prose-wraps that result for the user. Sometimes you want the agent itself to return a typed object — a parsed intent, a classification, a domain record — not a paragraph. MAF supports that via structured outputs.
Python — hand a Pydantic model as response_format:
from pydantic import BaseModel
class WeatherReport(BaseModel):
city: str
conditions: str
temperature_c: int
agent = Agent(client, instructions=INSTRUCTIONS, tools=[get_weather])
response = await agent.run(
"What's the weather in Paris?",
response_format=WeatherReport,
)
# response.value is a WeatherReport instance

.NET — use the generic RunAsync<T>():
public record WeatherReport(string City, string Conditions, int TemperatureC);
var agent = chatClient.AsAIAgent(instructions: Instructions,
tools: new AITool[] { AIFunctionFactory.Create(GetWeather) });
var result = await agent.RunAsync<WeatherReport>("What's the weather in Paris?");
// result.Value is a WeatherReport

Both sides drive the same mechanism under the hood: the LLM is told to emit JSON matching a schema derived from the type. MAF parses and deserialises. Works on top of tool calls — the LLM can still call get_weather mid-run, then shape the final answer into the requested type.
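To see the derived contract for yourself, ask Pydantic directly. This is plain Pydantic v2, nothing MAF-specific:

```python
from pydantic import BaseModel

class WeatherReport(BaseModel):
    city: str
    conditions: str
    temperature_c: int

# The JSON schema the LLM is constrained to emit:
print(WeatherReport.model_json_schema())
# {'properties': {'city': {'title': 'City', 'type': 'string'},
#                 'conditions': {'title': 'Conditions', 'type': 'string'},
#                 'temperature_c': {'title': 'Temperature C', 'type': 'integer'}},
#  'required': ['city', 'conditions', 'temperature_c'],
#  'title': 'WeatherReport', 'type': 'object'}
```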
Full treatment: Agents — Structured Outputs.
Appendix — hosted tools at a glance#
Everything above is a function tool: code you wrote, running in your process. MAF also supports hosted tools — capabilities the provider runs for you. They show up as special AITool instances you hand the agent the same way you’d hand it AIFunctionFactory.Create(...).
The four common ones:
| Hosted tool | What it does |
|---|---|
| Code Interpreter | The provider spins up a sandboxed Python kernel, lets the agent write and execute code, and feeds results back. |
| File Search | The provider indexes files you upload and the agent queries them as a RAG backend. |
| Web Search | The provider runs web queries and returns ranked results. |
| Hosted MCP | The provider connects to an MCP server you point it at and surfaces those tools as hosted capabilities. |
Availability varies by provider — OpenAI supports all four; Azure OpenAI’s coverage depends on the deployment and API version; third-party providers implement subsets. Consult the provider matrix before committing to a hosted tool. Runnable examples are out of scope for this chapter — the goal here is to know they exist so you don’t reinvent them as function tools.
Side-by-side differences#
| Aspect | Python | .NET |
|---|---|---|
| Tool declaration | @tool(name=..., description=...) decorator on function | AIFunctionFactory.Create(method) |
| Parameter metadata | Annotated[type, Field(description=...)] | [Description(...)] attribute on each parameter |
| Method-level metadata | description=... on the decorator | [Description(...)] attribute on the method |
| Passing to agent | Agent(..., tools=[my_tool]) | .AsAIAgent(..., tools: new AITool[] { ... }) |
| Calling underlying function directly | my_tool.func(...) | Call the method as normal C# |
| Async | async def — MAF awaits | async Task<T> / ValueTask<T> — MAF awaits |
| Schema source | Pydantic-derived from Annotated hints | Reflection + [Description] attributes |
| Structured output | response_format=MyPydanticModel | RunAsync<TMyType>(...) |
Structurally identical. Python hangs its metadata off the decorator and Pydantic; .NET hangs it off attributes and reflection.
Gotchas#
- The system prompt must mention the tool. The LLM only calls tools it “remembers” exist. If your `get_weather` tool is defined but the instructions don’t nudge toward it, expect hallucinated weather instead of tool calls. Named references (“call `get_weather`”) outperform generic hints.
- Descriptions are the ranking signal. Given five candidate tools, the LLM ranks them by description relevance, not by name similarity. A cryptic name with a precise description wins.
- Azure `api_version` matters for tools. Older Azure API versions drop tool-call parts silently. Use `2024-10-21` or newer; the default in the shipped code works against every region the series was tested in.
- Python `@tool` returns a `FunctionTool`, not the plain function. Direct-invocation tests need `get_weather.func(...)` — see `tests/test_add_tools.py` for the pattern.
- Don’t raise inside a tool without thinking. An uncaught exception propagates up to the run, not back to the LLM as “tool failed.” If you want the LLM to retry differently on failure, return a string describing the error (see the sketch after this list); if you want the run to abort, raise.
- One tool per concern. A single tool that does “search or fetch by ID or list all” is harder for the LLM to pick correctly than three named tools. The production capstone has seven narrow product-discovery tools for exactly this reason — see `agents/python/product_discovery/tools.py:15-92`.
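Here is the error-returning pattern from the raise gotcha, sketched against this chapter's tool. `fetch_forecast` is a hypothetical flaky backend, not something in the repo:

```python
# Sketch: surface failure to the LLM as data instead of crashing the run.
# fetch_forecast() is a hypothetical network-backed lookup.
@tool(name="get_weather", description="Look up the current weather for a city.")
def get_weather(
    city: Annotated[str, Field(description="The city to look up, e.g. 'Paris'.")],
) -> str:
    try:
        return fetch_forecast(city)
    except TimeoutError:
        # Returned as the tool result: the LLM sees this text and can retry,
        # rephrase, or apologise, and the run keeps going.
        return f"Weather service timed out for {city}. Try again shortly."
    # Raising here instead would propagate out of agent.run() and abort the run.
```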
Tests#
Both sides ship fast unit tests (tool function in isolation) plus credential-gated integration tests that hit the real LLM:
# Python — 6 tests: 3 unit (canned data, unknown city, case insensitive),
# 1 structural (tool registered on agent), 2 integration (LLM invokes tool,
# LLM skips tool on unrelated question)
cd tutorials/02-add-tools/python
uv run pytest -v
# .NET — 5 tests: 3 unit, 2 integration
cd tutorials/02-add-tools/dotnet
dotnet test tests/AddTools.Tests.csproj

11 tests total. The two integration tests in each language are the interesting ones: they assert the LLM does call the tool on “weather in Paris?” (canned Sunny / 18 leaks into the prose answer) and does not call it on “capital of France?” (no canned strings in the answer). That’s how you verify the ranking signal is working end-to-end.
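The unit half of that split needs no credentials. A minimal sketch of the direct-invocation style (the shipped `tests/test_add_tools.py` is the canonical version):

```python
# Sketch: @tool wraps get_weather in a FunctionTool, so unit tests reach
# the underlying function through .func(...).
from main import get_weather

def test_known_city_is_case_insensitive():
    assert "18°C" in get_weather.func("PARIS")

def test_unknown_city_returns_fallback():
    assert get_weather.func("Oslo") == "No weather data for Oslo."
```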
How this shows up in the capstone#
Every specialist agent in the capstone is this pattern multiplied:
- Python tools canonical — `agents/python/product_discovery/tools.py:15-92` defines seven `@tool`-decorated functions (`search_products`, `get_product_details`, `semantic_search`, `compare_products`, `get_trending_products`, and more). Same `Annotated[..., Field(description=...)]` shape, same `tools=[...]` wiring. The only real difference: these hit Postgres via `get_pool()` instead of a dict literal.
- Orchestrator tool composition — `agents/python/orchestrator/agent.py:25-80` shows the pattern composed with request-scoped context. The `call_specialist_agent` tool reads `current_user_email.get()` and `current_conversation_history.get([])` from ContextVars rather than taking them as parameters. That keeps the JSON schema the LLM sees narrow (two args — `agent_name`, `message`) while still propagating identity and history under the covers (sketch after this list).
- .NET mirror — `agents/dotnet/src/ECommerceAgents.Orchestrator/Agent/OrchestratorTools.cs` is the exact same tool in C#. One `[Description]`-decorated method, wrapped with `AIFunctionFactory.Create(CallSpecialistAgent, nameof(CallSpecialistAgent))`, exposed through an `All()` enumerable that the factory passes to `.AsAIAgent()`.
- Input validation — `agents/python/tests/test_tool_input_validation.py` exercises the schema boundary: invalid arguments from the LLM are rejected before the tool body runs. Pydantic handles the coercion; your tool body only sees validated inputs.
- Destructive-action gates — `agents/python/tests/test_destructive_tool_gates.py` verifies that tools which mutate state (cancel order, initiate return) reject calls when the caller isn’t authorised. The authorisation comes from ContextVars, not tool arguments — the LLM cannot impersonate another user just by passing a different email string.
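The ContextVar trick from the orchestrator bullet, reduced to its shape. A sketch, not the capstone's code: `route_to_specialist` is a hypothetical router.

```python
from contextvars import ContextVar
from typing import Annotated
from agent_framework import tool
from pydantic import Field

# Request-scoped identity, set by the web layer before the agent runs.
current_user_email: ContextVar[str] = ContextVar("current_user_email")

@tool(name="call_specialist_agent",
      description="Route a message to a named specialist agent.")
def call_specialist_agent(
    agent_name: Annotated[str, Field(description="Which specialist to call.")],
    message: Annotated[str, Field(description="The message to forward.")],
) -> str:
    # Never an LLM-supplied argument: identity rides in on the request
    # context, so the advertised schema stays at exactly two parameters.
    user = current_user_email.get()
    return route_to_specialist(agent_name, message, user)  # hypothetical router
```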
Production tool patterns layer validation, authorisation, database access, and telemetry on top of this chapter’s 15-line weather example. Nothing in the decorator shape changes.
Further reading & links#
This chapter
- Canonical article: nitinksingh.com/posts/maf-v1-02-add-tools/
- Source on GitHub: tutorials/02-add-tools
- Previous: Chapter 01 — Your First Agent · Next: Chapter 03 — Streaming and Multi-turn
Microsoft Agent Framework docs
- Agents — Tools (overview)
- Agents — Function Tools
- Journey — Adding Tools
- Get Started — Add Tools
- Agents — Structured Outputs
- Agents — Providers (hosted-tool support matrix)
Where it lives in the capstone
- Python tools canonical: `agents/python/product_discovery/tools.py:15-92`
- Python orchestrator tool + context: `agents/python/orchestrator/agent.py:25-80`
- .NET tool pattern: `agents/dotnet/src/ECommerceAgents.Orchestrator/Agent/OrchestratorTools.cs`
- Input validation tests: `agents/python/tests/test_tool_input_validation.py`
- Destructive-action gate tests: `agents/python/tests/test_destructive_tool_gates.py`
Series shared resources
What’s next#
Chapter 03 — Streaming and Multi-turn keeps the same agent but swaps run() for a streaming variant and shows how to hold a conversation across turns. The tool we wrote today comes along unchanged — same decorator, same schema, same loop — it just gets called inside a richer interaction surface.

