
MAF v1 — Context Providers (Python + .NET)

Nitin Kumar Singh
MAF v1: Python and .NET - This article is part of a series.
Part 5: This Article

Series note — Chapter 05 of MAF v1: Python and .NET. The Python-only predecessor Part 8 — Agent Memory: Remembering Across Conversations focused on long-term vector memory. This chapter is the primitive underneath that: the MAF-native hook that runs before every agent turn and extends the system prompt with whatever your request actually needs — user profile, recent activity, retrieved documents, feature flags, anything.

Repo — Runnable code for this chapter: tutorials/05-context-providers. Clone, cd in, follow along.

Why this chapter
#

Every real agent needs to know who it’s talking to. “Alice, gold tier, last order was a refund two days ago” is the kind of context that decides whether an answer is useful or generic. The question is where that context lives.

The wrong answer is to smash it into the system prompt at agent-construction time:

# do not do this
instructions = f"You are a shopping assistant. The user is {user.name} ({user.email}). Their tier is {user.tier}."
agent = Agent(client, instructions=instructions, ...)

That binds the agent to one user for its whole lifetime, forces you to rebuild the agent on every request, and scatters string interpolation across your code. As soon as you want to add recent orders or memories on top, the instructions block balloons and your factory turns into a format() shop.

The right answer in MAF is a ContextProvider (Python) / AIContextProvider (.NET) — a small object that fires before every LLM call, reads whatever it needs, and appends to the instructions for this run only. The agent itself stays static; the context is request-scoped; each concern gets its own provider that you can swap or omit per agent. MAF also ships TextSearchProvider — the same mechanism wired to a search function for drop-in RAG.

Prerequisites
#

The concept
#

A context provider is a tiny, async object with one job: run before the LLM call and return extra context to merge into this turn’s prompt.

The contract in both stacks is the same:

  1. Your code calls agent.run(...) / agent.RunAsync(...).
  2. For each registered provider, MAF invokes the before-run hook (before_run in Python, ProvideAIContextAsync in .NET).
  3. The provider reads whatever it needs (DB, Redis, an HTTP API, a feature flag service) and declares what to inject — extra instruction text, extra messages, or extra tools.
  4. MAF merges every provider’s output into the outgoing request. The LLM sees one merged system prompt; your provider code never touched it.
  5. The agent’s static instructions and the merged per-run additions ship to the model together.

The trick is that providers compose. A production agent typically has three or four stacked — a profile provider, a recent-orders provider, a memories provider, maybe a feature-flag provider — and each one is free to read state that an earlier provider populated.
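At registration time, stacking looks something like this — a minimal sketch using the capstone's provider names, with constructor arguments simplified for illustration:

# Sketch only — provider names match the capstone; constructors are simplified.
agent = Agent(
    client,
    instructions="You are a shopping assistant.",
    context_providers=[
        UserProfileProvider(email=email, name=name),  # writes state["user"]
        RecentOrdersProvider(),   # reads state["user"]["email"] left above
        AgentMemoriesProvider(),  # injects a preferences/history block
    ],
)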

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor': '#2563eb','primaryTextColor': '#ffffff','primaryBorderColor': '#1e40af', 'lineColor': '#64748b','secondaryColor': '#f59e0b','tertiaryColor': '#10b981', 'background': 'transparent'}}}%%
flowchart LR
    classDef core fill:#2563eb,stroke:#1e40af,color:#ffffff
    classDef external fill:#f59e0b,stroke:#b45309,color:#000000
    classDef success fill:#10b981,stroke:#047857,color:#ffffff
    classDef infra fill:#64748b,stroke:#334155,color:#ffffff
    req([Incoming request])
    agent[Agent.run / RunAsync]
    p1[[UserProfileProvider]]
    p2[[RecentOrdersProvider]]
    p3[[AgentMemoriesProvider]]
    db[(Postgres / stores)]
    ctx[AIContext<br/>extend_instructions]
    prompt[Merged system prompt]
    llm[(LLM)]
    answer([AgentResponse])
    req --> agent
    agent -- "before_run / ProvideAIContextAsync" --> p1
    agent --> p2
    agent --> p3
    p1 -- "read profile" --> db
    p2 -- "read orders" --> db
    p3 -- "read memories" --> db
    p1 --> ctx
    p2 --> ctx
    p3 --> ctx
    ctx --> prompt
    prompt --> llm
    llm --> agent
    agent --> answer
    class agent,p1,p2,p3,ctx core
    class llm external
    class db infra
    class answer success
    class prompt infra

Three providers, one merged prompt. Each provider owns its concern; the agent’s static instructions never change across requests. The AIContext box is where MAF collects everything your providers returned and folds it into the system prompt for this turn only.

Jargon recap
#

  • ContextProvider (Python) / AIContextProvider (.NET) — abstract base class you subclass. One instance per concern (profile, orders, memory). Registered on the agent; MAF runs each one before every LLM call.
  • before_run (Python) — the async method you override. Signature: async def before_run(self, *, agent, session, context, state). Called by MAF before each turn; mutates context / state to inject your additions.
  • ProvideAIContextAsync (.NET) — the protected async method you override. Takes an InvokingContext, returns an AIContext. The public InvokingAsync is sealed — the framework owns it and calls your override internally.
  • InvokingContext (.NET) — carries the agent, the session, and the current messages into your override so you can decide what to inject based on what the user just asked.
  • AIContext (.NET) / context.extend_instructions (Python) — the return/output shape. Holds additional Instructions, Messages, and Tools MAF will merge into this request.
  • extend_instructions(source_id, text) (Python) — the primary way to append system-prompt lines from a provider. source_id tags the contribution so MAF can dedupe.
  • source_id — a string identifier for each contribution ("user-profile", "recent-orders"). Used for deduplication and debug output. In Python it’s required on both super().__init__(source_id=...) and every extend_instructions(source_id, ...) call.
  • AIContextProviders (.NET) — the collection on ChatClientAgentOptions. Hand it an array of providers; MAF executes them in order before every run.
  • TextSearchProvider — the RAG context provider MAF ships. Wraps a search function and injects matching documents into context before the LLM runs.
  • RAG (Retrieval Augmented Generation) — fetch relevant documents at runtime and hand them to the LLM as context. In MAF it’s a context provider like any other.

Full definitions in the jargon glossary.

Code walkthrough
#

Full source: dotnet/Program.cs. Key lines:

// dotnet/Program.cs (excerpt)
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

public const string Instructions =
    "You are a personal shopping assistant. "
    + "Greet the user by name if you know it. Keep answers short.";

public sealed class UserProfileProvider : AIContextProvider
{
    public string Email { get; }
    public string Name { get; }
    public string LoyaltyTier { get; }

    public UserProfileProvider(string email, string name, string loyaltyTier = "silver") =>
        (Email, Name, LoyaltyTier) = (email, name, loyaltyTier);

    // MAF invokes this before each agent run. Return an AIContext whose
    // Instructions get merged into the system prompt for this turn.
    protected override ValueTask<AIContext> ProvideAIContextAsync(
        InvokingContext context,
        CancellationToken cancellationToken = default) =>
        ValueTask.FromResult(new AIContext
        {
            Instructions = $"Current user: {Name} ({Email}). Loyalty tier: {LoyaltyTier}.",
        });
}

public static AIAgent BuildAgent(AIContextProvider provider)
{
    var chatClient = /* OpenAI or Azure OpenAI ChatClient */;
    return chatClient.AsAIAgent(new ChatClientAgentOptions
    {
        Name = "personalized-agent",
        ChatOptions = new ChatOptions { Instructions = Instructions },
        AIContextProviders = new[] { provider },
    });
}

Two things worth staring at:

  • The override point is the protected ProvideAIContextAsync, not the public InvokingAsync. InvokingAsync is sealed — MAF owns it, and it calls your ProvideAIContextAsync internally. Trying to override InvokingAsync is a compile error; it’s the framework’s way of guaranteeing providers don’t bypass the context-merge logic.
  • You return a new AIContext { Instructions = "..." } each call. The record has three settable fields: Instructions (string), Messages (list of chat messages to inject directly), and Tools (extra AIFunction instances for this turn only). All three are nullable — populate what you need.

Run it:

cd tutorials/05-context-providers/dotnet
dotnet run
# A: Welcome back, Alice! You're on our Gold loyalty tier.

dotnet run -- bob@example.com Bob silver
# A: Hi Bob, you're on our Silver loyalty tier.

Full source: python/main.py. Key lines:

# python/main.py (excerpt)
from typing import Any
from agent_framework import Agent, ContextProvider

INSTRUCTIONS = (
    "You are a personal shopping assistant. "
    "Greet the user by name if you know it. "
    "Keep answers short."
)

class UserProfileProvider(ContextProvider):
    """Injects the current user's profile into every run."""

    def __init__(self, *, email: str, name: str, loyalty_tier: str = "silver") -> None:
        super().__init__(source_id="user-profile")
        self.email = email
        self.name = name
        self.loyalty_tier = loyalty_tier

    async def before_run(
        self, *, agent: Any, session: Any, context: Any, state: dict[str, Any]
    ) -> None:
        context.extend_instructions(
            "user-profile",
            f"Current user: {self.name} ({self.email}). Loyalty tier: {self.loyalty_tier}.",
        )
        # Stash on state so downstream providers or tools can read structured values.
        state["user"] = {"email": self.email, "name": self.name, "loyalty_tier": self.loyalty_tier}


def build_agent(provider: ContextProvider, client=None) -> Agent:
    return Agent(
        client or _default_client(),
        instructions=INSTRUCTIONS,
        name="personalized-agent",
        context_providers=[provider],
    )

Three things worth staring at:

  • super().__init__(source_id="user-profile") — the base class requires a source_id. Skip it and the constructor raises. Match it with the same tag on every extend_instructions(...) call so MAF can track which provider contributed what.
  • before_run receives a state dict. That dict is shared across providers within one run — earlier providers can leave values for later ones. The capstone’s RecentOrdersProvider reads state["user"]["email"] that UserProfileProvider wrote moments earlier (sketched just below). Same pattern scales to memory, feature flags, A/B buckets.
  • context_providers=[...] on the Agent is a static list, but the output of each provider is request-scoped. One agent instance, many concurrent runs, each with its own injected context — because before_run fires per-call, not per-construction.
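Bullet two in action — a sketch of a downstream provider consuming what UserProfileProvider left on state. fetch_recent_orders is a hypothetical async lookup, not from the repo:

# Sketch — fetch_recent_orders is hypothetical; the shape mirrors the tutorial's provider.
class RecentOrdersProvider(ContextProvider):
    def __init__(self) -> None:
        super().__init__(source_id="recent-orders")

    async def before_run(
        self, *, agent: Any, session: Any, context: Any, state: dict[str, Any]
    ) -> None:
        user = state.get("user")  # written by UserProfileProvider earlier this run
        if user is None:
            return  # no identity yet — inject nothing
        orders = await fetch_recent_orders(user["email"], limit=5)
        bullets = "\n".join(f"- {order}" for order in orders)
        context.extend_instructions("recent-orders", f"Recent orders:\n{bullets}")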

Run it:

cd tutorials/05-context-providers/python
uv sync
uv run python main.py
# A: Hi Alice! You're on our gold loyalty tier — thanks for being a valued customer.

# Swap the user without rebuilding the agent factory:
uv run python main.py bob@example.com Bob silver
# A: Hi Bob! You're on our silver loyalty tier.

The agent’s static INSTRUCTIONS never mention Alice or Bob. The tier-specific greeting only works because the provider injected the profile line into the prompt for that run.

RAG — TextSearchProvider out of the box
#

RAG is just a context provider with a search function wired in. Once you’ve seen UserProfileProvider, TextSearchProvider is the same shape — it just reads from your retrieval backend instead of a hand-held field. Think of it as the “do retrieval, paste the top-k snippets into the next prompt” pattern, pre-packaged.

The flow:

  1. MAF calls ProvideAIContextAsync before the LLM call.
  2. TextSearchProvider inspects the last user message (by default — configurable) and calls the search function you passed it.
  3. Matching documents come back, get formatted, and get injected via AIContext.Instructions.
  4. The LLM sees the user question plus the retrieved snippets in a single turn.

In .NET, the wiring looks like this:
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

// 1. Your search adapter — anything that returns snippets.
async Task<IReadOnlyList<TextSearchResult>> SearchAsync(string query, CancellationToken ct)
{
    // Replace this with a call to Azure AI Search, a vector DB, your API, etc.
    var hits = await _vectorStore.QueryAsync(query, topK: 3, ct);
    return hits.Select(h => new TextSearchResult(h.Text) { Name = h.Title }).ToList();
}

// 2. Wire it into a TextSearchProvider.
var rag = new TextSearchProvider(
    searchFunc: SearchAsync,
    options: new TextSearchProviderOptions
    {
        SearchTime = TextSearchProviderSearchTime.BeforeAIInvoke,   // fire pre-LLM, every turn
        ContextPrompt = "Use the following product documents to answer:",
        FunctionToolName = null,   // optional tool surface, omitted here
        MaxResults = 3,
    });

// 3. Register like any other provider.
var agent = chatClient.AsAIAgent(new ChatClientAgentOptions
{
    Name = "docs-aware-agent",
    ChatOptions = new ChatOptions { Instructions = "You answer product questions. Cite sources." },
    AIContextProviders = new[] { rag },
});

var response = await agent.RunAsync("What's the return window on the XR-5 headphones?");

The three options worth knowing on day one:

  • SearchTime — BeforeAIInvoke (default) runs the search before every LLM call; OnDemand only runs when the LLM invokes a retrieval tool — useful when you want the LLM to decide whether to search, at the cost of an extra round-trip.
  • FunctionToolName — when set, exposes retrieval as a named tool the LLM can call directly (typically paired with SearchTime.OnDemand). Lets the LLM issue multiple targeted queries per turn instead of relying on one automatic pre-search.
  • ContextPrompt — the framing text prepended to the retrieved snippets before they reach the LLM. Default is generic; override it to match your domain (“The following are pricing FAQ entries…” reads better than the default).

Python has the same type under agent_framework.context_providers.TextSearchProvider; the constructor takes a search callable and an options object with the same field names. Full walkthroughs: RAG overview and the context providers journey.
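A sketch of that Python wiring, extrapolated from the .NET options above — the options class name and the snake_case field spellings are assumptions, so check the docs before copying:

# Sketch — TextSearchProviderOptions and its field spellings are assumed
# from the .NET names; only the module path is taken from the text above.
from agent_framework.context_providers import (
    TextSearchProvider,
    TextSearchProviderOptions,  # class name assumed
)

async def search(query: str):
    # Replace with Azure AI Search, a vector DB, your API, etc.
    return await vector_store.query(query, top_k=3)  # vector_store: your adapter

rag = TextSearchProvider(
    search,
    options=TextSearchProviderOptions(
        search_time="before_ai_invoke",  # assumed spelling of BeforeAIInvoke
        context_prompt="Use the following product documents to answer:",
        max_results=3,
    ),
)

agent = Agent(
    client,
    name="docs-aware-agent",
    instructions="You answer product questions. Cite sources.",
    context_providers=[rag],
)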

The important takeaway: RAG is not a separate subsystem in MAF. It’s a context provider with a pre-built adapter. Everything you learned about registering, ordering, and composing providers applies to it unchanged.

Side-by-side differences
#

| Aspect | Python | .NET |
| --- | --- | --- |
| Base class | agent_framework.ContextProvider | Microsoft.Agents.AI.AIContextProvider |
| Override point | async before_run(*, agent, session, context, state) | protected override ValueTask<AIContext> ProvideAIContextAsync(InvokingContext, CancellationToken) |
| Sealed layer | None — before_run is the public hook | Public InvokingAsync is sealed; you override the protected ProvideAIContextAsync |
| Injecting instructions | context.extend_instructions(source_id, text) | return new AIContext { Instructions = "..." } |
| Injecting messages | context.extend_messages([...]) | new AIContext { Messages = new[] { ... } } |
| Injecting tools | context.extend_tools([...]) | new AIContext { Tools = new AITool[] { ... } } |
| Shared state | state dict passed into before_run | No built-in equivalent — use DI / captured fields |
| source_id | Required on __init__ and every extend_instructions call | N/A — .NET dedupes on reference identity |
| Registration | Agent(..., context_providers=[p1, p2, ...]) | ChatClientAgentOptions.AIContextProviders = new[] { p1, p2, ... } |
| RAG provider | agent_framework.context_providers.TextSearchProvider | Microsoft.Agents.AI.TextSearchProvider |

Structurally the same shape. Python hangs extra composition ergonomics off the state dict and source_id; .NET stays closer to the DI/record idiom and leans on ValueTask for low-allocation async.

Gotchas
#

  • Python source_id is required on both sides. super().__init__(source_id=...) in the constructor and context.extend_instructions("your-id", text) in every call. Forget either and you get a runtime error at construction or on first run.
  • Override ProvideAIContextAsync, not InvokingAsync. On .NET, InvokingAsync is sealed because the framework needs to own the merge step. Overriding it is a compile error; this trips almost everyone on first contact with the API.
  • Provider instances are shared across runs. Don’t cache request-scoped data on the provider object itself — store it in the state dict (Python) or pass it through DI (.NET). The same provider instance serves every concurrent request; mutable fields leak between callers. See the sketch after this list.
  • state is per-run, not per-session. A fresh dict on every turn. If you need cross-turn state, write to session.state in Python or a session-backed store in .NET — see Chapter 04 — Sessions.
  • Large instructions cost tokens. Every provider line is a system-prompt line, billed every turn. A RecentOrdersProvider that injects 50 orders in verbose JSON will eat half your context window. Trim, summarise, or make the provider conditional on the user’s question.
  • Order matters when providers read each other’s state. Python runs context_providers in list order; RecentOrdersProvider reading state["user"]["email"] only works if UserProfileProvider ran first. Register them in dependency order.
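The shared-instance gotcha is the one most likely to bite in production, so here is a minimal sketch of the wrong and right versions — load_user is a hypothetical async lookup:

# Sketch of the shared-instance gotcha — load_user is hypothetical.
class LeakyProvider(ContextProvider):
    def __init__(self) -> None:
        super().__init__(source_id="leaky")

    async def before_run(self, *, agent, session, context, state) -> None:
        # WRONG: one provider instance serves every concurrent run, so this
        # field can be overwritten mid-flight by another caller.
        self.user = await load_user()
        context.extend_instructions("leaky", f"Current user: {self.user}")

class SafeProvider(ContextProvider):
    def __init__(self) -> None:
        super().__init__(source_id="safe")

    async def before_run(self, *, agent, session, context, state) -> None:
        # Right: state is a fresh dict per run — nothing leaks across callers.
        state["user"] = await load_user()
        context.extend_instructions("safe", f"Current user: {state['user']}")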

Tests
#

# Python — 3 unit + 1 integration. Unit tests drive CannedChatClient to
# assert the injected text reaches the options dict MAF passes to the model.
cd tutorials/05-context-providers/python
uv run pytest -v

# .NET — 3 unit + 2 integration (real LLM, per-user isolation).
cd tutorials/05-context-providers/dotnet
dotnet test tests/ContextProviders.Tests.csproj

9 tests total. The key assertions across both suites: (a) the provider’s contribution actually lands in the instructions the chat client receives before the LLM call (proving the merge works), and (b) two concurrent agents wired to different providers return answers that reference only their own user (proving the request-scoped isolation works — no leakage across instances).
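For flavour, a sketch of the Python unit-test idea — CannedChatClient is the repo’s test double, but its constructor argument and the attribute it records instructions on are assumptions here:

# Sketch — CannedChatClient's reply= argument and last_instructions attribute
# are assumptions; imports from the tutorial package are omitted.
import pytest

@pytest.mark.asyncio
async def test_profile_reaches_prompt() -> None:
    canned = CannedChatClient(reply="hi")
    provider = UserProfileProvider(email="alice@example.com", name="Alice", loyalty_tier="gold")
    agent = build_agent(provider, client=canned)

    await agent.run("hello")

    # (a) the provider's contribution lands in the instructions the client saw
    assert "Alice" in canned.last_instructions
    assert "gold" in canned.last_instructions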

How this shows up in the capstone
#

The tutorial runs one provider against hand-held fields. The capstone fans the same pattern out to three stacked providers against Postgres.

Python — three fine-grained providers in agents/python/shared/context_providers.py:

  • UserProfileProvider (lines 35–86) — reads the logged-in user from users; writes state["user"] and extends instructions with a one-line “Current user” header.
  • RecentOrdersProvider (lines 89–148) — reads the user’s last 5 orders from orders joined on users; writes state["recent_orders"] and extends instructions with a bulleted order list.
  • AgentMemoriesProvider (lines 151–207) — reads active rows from agent_memories ordered by importance; writes state["memories"] and extends instructions with a ## User Preferences & History block.
  • ECommerceContextProvider (lines 213–273) — a composite that runs all three in sequence and reassembles the legacy state["user_context"] string the custom tool loop needs; a sketch of its shape follows below. Specialists that only need a subset can register the fine-grained providers directly.

Wired into the orchestrator at agents/python/orchestrator/agent.py:94 via a single context_providers=[ECommerceContextProvider()] argument. Every request to the orchestrator goes through all three lookups before the LLM sees the first token. Identity comes from current_user_email (a ContextVar) — the provider never takes the user as a parameter.
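The composite’s shape, sketched from the description above — the delegation mechanics and the reassembly helper are guesses at the real file, not copied from it:

# Sketch — build_legacy_context is hypothetical; see the repo for the real code.
class ECommerceContextProvider(ContextProvider):
    def __init__(self) -> None:
        super().__init__(source_id="ecommerce-context")
        # Identity flows in via the current_user_email ContextVar, so the
        # capstone's children take no user parameters.
        self._children = [
            UserProfileProvider(),
            RecentOrdersProvider(),
            AgentMemoriesProvider(),
        ]

    async def before_run(self, *, agent, session, context, state) -> None:
        for child in self._children:
            await child.before_run(agent=agent, session=session, context=context, state=state)
        # Reassemble the legacy string the custom tool loop expects.
        state["user_context"] = build_legacy_context(state)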

.NET — equivalent shape in agents/dotnet/src/ECommerceAgents.Shared/ContextProviders/ContextEnricher.cs — reads the same three tables through Dapper, produces the same UserContext string so both stacks’ specialists see byte-identical system prompts given the same DB state. Tests: ContextEnricherTests.cs.

Tests for the composite chain in agents/python/tests/test_context_providers.py — confirms: (a) each individual provider runs and populates its own state key, (b) the composite assembles the legacy string in the order specialists expect, (c) unauthenticated / "system" callers short-circuit before hitting the DB.

Nothing about this pattern changes as the app grows. Add a fourth concern (a feature-flag provider, an A/B-bucket provider, a cart-contents provider) and you add a fourth class to the list. The orchestrator factory does not change.

Further reading & links
#

This chapter

Microsoft Agent Framework docs

Where it lives in the capstone

  • Python providers: agents/python/shared/context_providers.py:35-273
  • Python orchestrator wiring: agents/python/orchestrator/agent.py:94
  • Python tests: agents/python/tests/test_context_providers.py
  • .NET enricher: agents/dotnet/src/ECommerceAgents.Shared/ContextProviders/ContextEnricher.cs
  • .NET tests: agents/dotnet/tests/ECommerceAgents.Shared.Tests/ContextEnricherTests.cs

Series shared resources

What’s next
#

Chapter 06 — Middleware keeps climbing the request pipeline. Context providers run before the LLM call to inject context; middleware wraps the call itself — logging, auth gates, tool-approval flows, PII redaction, short-circuits. Same pattern of small composable objects around a single agent call; different layer, different concerns.

