
# MAF v1 — Handoff Orchestration (Python + .NET)

Author: Nitin Kumar Singh

Series note — Part of MAF v1: Python and .NET. Third orchestration pattern after Sequential and Concurrent. Next: Group Chat.

Repo — Full runnable code for this chapter is at https://github.com/nitin27may/e-commerce-agents/tree/main/tutorials/14-handoff-orchestration. Clone the repo, cd tutorials/14-handoff-orchestration, and follow the per-language instructions below.

## Why this chapter

Sequential (Ch12) hard-codes the order. Concurrent (Ch13) runs everyone in parallel and then aggregates. Both predetermine the graph. Handoff is the first orchestration where the agents themselves decide what happens next. A Triage agent reads the user’s question, decides “this is math” or “this is history”, and invokes a tool call whose sole effect is to transfer control to a specialist. The specialist can finish the turn — or hand back to Triage if the question was mis-routed, or to a third specialist if the conversation evolves.

That shape is the backbone of customer-support bots (“billing question → billing specialist, account question → account specialist, escalation → human”), tutoring systems, and research assistants that pull in domain experts on demand. It’s also the shape that requires the most care, because agent-driven routing is where loops emerge: Triage hands to Math, Math decides “this is history actually” and hands back, Triage hands to History, History hands to Math, and the LLM budget evaporates. This chapter builds the happy path, shows exactly how the loop pathology happens, and names the mechanism that keeps it bounded — turn_limits.

Jargon defined inline below: HandoffBuilder (Python builder), AgentWorkflowBuilder.CreateHandoffBuilderWith (.NET builder), mesh topology, synthesised handoff_to_<name> tools, handoff_sent event, autonomous mode, turn_limits.

## Prerequisites

  • Completed Chapter 13 — Concurrent Orchestration.
  • .env at the repo root with either OPENAI_API_KEY or the Azure OpenAI trio (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_KEY, AZURE_OPENAI_DEPLOYMENT). The Python and .NET samples both run three real LLM calls end-to-end — unlike Ch09/Ch10 this chapter has no offline mode.
  • uv for Python; .NET 10 SDK for the .NET sample.

## The concept

### Mesh topology, not pipeline

Sequential is a line. Concurrent is a star. Handoff is a mesh: every agent in the participants list is a node, and every configured handoff is a directed edge. The framework synthesises one tool per outgoing edge and advertises it to that agent’s LLM — a synthesised handoff_to_<name> tool is a function the framework injects into the agent’s tool list at build time, with a JSON schema derived from the target agent’s description, and whose sole side-effect when invoked is to transfer control of the workflow to the target. Agents don’t build the graph; they navigate it.

flowchart LR
    user([User question])
    triage["Triage agent<br/>start node"]
    math[Math specialist]
    history[History specialist]
    out([Final answer])
    user --> triage
    triage -- "handoff_to_math" --> math
    triage -- "handoff_to_history" --> history
    math -- "handoff_to_triage" --> triage
    history -- "handoff_to_triage" --> triage
    math --> out
    history --> out

Three agents, four handoff edges, two “answer” exits. The forward edges from Triage are how routing happens; the back edges from the specialists are what enables cycles — and why turn_limits exist.

Key properties of the mesh:

  • Every outgoing edge becomes a tool. WithHandoffs(triage, new[] { mathTutor, historyTutor }) makes two tools appear on Triage’s tool list: handoff_to_math_tutor and handoff_to_history_tutor. Triage never sees the graph; it sees two function signatures.
  • The edges are directional. If you declare triage -> math but not math -> triage, Math has no handoff tool to call — it must either answer or fall through to the user-input pause.
  • Agents with no outgoing edges must terminate. Without any handoff_to_* tool the LLM has exactly one legal output shape: a plain assistant message. That message ends the turn.
  • Mesh is not all-to-all. You configure edges explicitly in .NET. Python’s HandoffBuilder defaults to all-to-all if you never call add_handoff(...), but as soon as you call it once the defaults disappear and you have to configure every source.
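The edge-to-tool rule can be sketched in a few lines of plain Python. This is illustrative only, not MAF internals; synthesize_handoff_tools is a hypothetical helper name:

```python
# Illustrative sketch, not MAF source: each declared handoff edge becomes
# one synthesised tool name on the source agent's tool list.
def synthesize_handoff_tools(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each source agent to the handoff tool names its LLM will see."""
    return {
        source: [f"handoff_to_{target}" for target in targets]
        for source, targets in edges.items()
    }

# The mesh from the diagram: triage fans out, specialists hand back.
edges = {"triage": ["math", "history"], "math": ["triage"], "history": ["triage"]}
tools = synthesize_handoff_tools(edges)
print(tools["triage"])  # ['handoff_to_math', 'handoff_to_history']
```

Note there is no entry for a terminal agent: no outgoing edges, no handoff tools, and the agent must answer.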

### Why loops happen, and what stops them

The mesh has cycles by construction: once a specialist has a back-edge to Triage, Triage has forward edges to every specialist, and every LLM decision is non-deterministic. Nothing in the graph stops a pathological run where the agents keep passing the baton to each other because each one is slightly unsure whether the question is really in its domain.

flowchart LR
    q(["User question<br/>ambiguous: math or history"])
    t1[Triage turn 1]
    m1[Math turn 1]
    t2[Triage turn 2]
    m2[Math turn 2]
    halt["turn_limits tripped<br/>framework ends run"]
    q --> t1
    t1 -- "handoff_to_math" --> m1
    m1 -- "handoff_to_triage" --> t2
    t2 -- "handoff_to_math" --> m2
    m2 -. "would hand back" .-> halt

A realistic pathological loop. Each handoff consumes a turn; turn_limits is the budget that stops the run when an agent would exceed its cap. Without that budget the workflow just keeps billing the LLM.

turn_limits is a per-agent budget on how many times that agent can be invoked in a single run. In Python it’s a kwarg on with_autonomous_mode:

.with_autonomous_mode(
    agents=[triage, math, history],
    turn_limits={triage.name: 3, math.name: 2, history.name: 2},
)

In .NET the handoff builder exposes it implicitly through the interactive request/response loop — each RunStreamingAsync(...) call is one “batch” of turns, and the outer caller is responsible for bounding the number of batches. Both languages have the same failure mode without a cap: the LLM-driven cycle keeps consuming tokens until something outside the framework (quota, timeout, Ctrl-C) stops it.

Pick limits based on the minimum path length you actually need. For the two-specialist mesh above, triage: 3, math: 2, history: 2 is generous: Triage can route, get a mis-routed hand-back, route again; each specialist gets two shots at answering.
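The budget mechanics can be simulated with a stdlib-only sketch. run_with_turn_limits is hypothetical; MAF enforces this inside the workflow, not in caller code:

```python
from collections import Counter

# Hypothetical simulation of the turn_limits budget: replay a sequence of
# agent turns and stop the moment an agent would exceed its cap.
def run_with_turn_limits(route: list[str], turn_limits: dict[str, int]) -> list[str]:
    turns_taken: Counter = Counter()
    executed: list[str] = []
    for agent in route:
        if turns_taken[agent] >= turn_limits.get(agent, 1):
            break  # budget exhausted: the run ends here
        turns_taken[agent] += 1
        executed.append(agent)
    return executed

# The pathological ping-pong from the diagram, bounded by the sample limits:
loop = ["triage", "math", "triage", "math", "triage", "math"]
print(run_with_turn_limits(loop, {"triage": 3, "math": 2, "history": 2}))
# ['triage', 'math', 'triage', 'math', 'triage']: math's second turn was its last
```

With the happy path (triage, then one specialist) these limits never come close to tripping; they exist purely to bound the pathological case.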

### Autonomous vs interactive mode

Handoff is the only orchestration in this series that is interactive by default. The reason is structural: when an agent doesn’t hand off, it has produced a message for the user — and the framework has no way to know whether the user wants to follow up. So instead of completing the workflow, the framework emits a request_info event and pauses, waiting for the caller to provide the next user turn.

Autonomous mode flips that. HandoffBuilder.with_autonomous_mode(...) tells the framework to fabricate a “continue” response on the agent’s behalf whenever it would otherwise pause — the agent keeps the turn and either answers again or hands off. This is what most tutorials demo because you can’t have an interactive CLI inside pytest. In production you usually want the interactive flow: the agent pauses, the user responds, the next batch of turns runs. .NET’s CreateHandoffBuilderWith defaults to interactive; the caller provides each user message via TrySendMessageAsync(new TurnToken(...)).

### Jargon recap

  • HandoffBuilder (Python) — the fluent Python builder: HandoffBuilder(participants=[...]).with_start_agent(a).add_handoff(a, [b, c]).build(). Lives in agent_framework.orchestrations. Emits a regular Workflow you drive with workflow.run(...) / workflow.run_stream(...).
  • AgentWorkflowBuilder.CreateHandoffBuilderWith (.NET) — the equivalent .NET entry point. Returns a HandoffWorkflowBuilder on which you chain WithHandoffs(source, targets) calls and then Build(). Lives in Microsoft.Agents.AI.Workflows. Marked [Experimental] in 1.1 — public preview, stable-enough shape, suppressed analyzer MAAIW001 in the sample .csproj.
  • Mesh topology — the graph shape where every participant is a node and every configured handoff is a directed edge. Contrast with Sequential (line) and Concurrent (star).
  • Synthesised handoff_to_<name> tool — an AITool the framework injects into each agent’s tool list at build time, one per outgoing handoff edge. The <name> is the target agent’s Name (name= in Python, name: in AsAIAgent). The tool’s JSON schema is built from the target’s Description / description= so the source agent’s LLM sees what each handoff target specialises in.
  • Autonomous mode — a Python builder toggle (with_autonomous_mode(...)) that makes the workflow fabricate a “continue” user turn whenever an agent declines to hand off, so the graph runs end-to-end without a human in the loop. Experimental in 1.1. .NET doesn’t expose it as a builder flag; the equivalent is the interactive TurnToken loop in Program.cs.
  • turn_limits — a per-agent budget (Python: kwarg on with_autonomous_mode; .NET: enforced by the outer request/response loop) bounding how many times each agent can be invoked in one run. Prevents the agents-bouncing-back-and-forth failure mode the second diagram shows.
  • handoff_sent event (Python) — a stream event emitted by the workflow each time an agent successfully invokes a handoff tool. Payload is a HandoffSentEvent(source, target). The event is the observable equivalent of “an edge in the mesh just fired.”
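The handoff_sent payload just described can be mocked with a tiny stand-in. The field names come from the prose above; the real MAF event class may carry more fields:

```python
from dataclasses import dataclass

# Stand-in for the handoff_sent payload described above, not the real MAF class.
@dataclass(frozen=True)
class HandoffSentEvent:
    source: str  # executor id of the agent that invoked the handoff tool
    target: str  # executor id of the agent that receives control

def routing_trace(events: list[HandoffSentEvent]) -> str:
    """Render the fired edges as the audit trail you would log or ship."""
    return " -> ".join([events[0].source] + [e.target for e in events]) if events else ""

print(routing_trace([HandoffSentEvent("triage", "math")]))  # triage -> math
```

A trace like this is exactly what you would ship to telemetry to prove which routing decisions fired.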

## Code walkthrough

Source: python/main.py. Three agents, one builder chain, one event-stream consumer that tracks routing.

from agent_framework import Agent
from agent_framework.orchestrations import HandoffBuilder


def triage() -> Agent:
    return Agent(
        _default_client(),
        instructions=(
            "You are a Triage agent. Read the user's question and hand off to the "
            "right specialist: math for arithmetic/math questions, history for "
            "historical facts or dates. If the specialist answers, simply acknowledge "
            "and stop — do not rewrite the answer."
        ),
        name="triage",
    )


def math_expert() -> Agent:
    return Agent(
        _default_client(),
        instructions=(
            "You are a Math expert. Answer arithmetic and math questions directly "
            "with a single short sentence containing the numerical answer."
        ),
        name="math",
    )


def history_expert() -> Agent:
    return Agent(
        _default_client(),
        instructions=(
            "You are a History expert. Answer historical questions in one short "
            "sentence with the specific date or year."
        ),
        name="history",
    )


def build_workflow():
    t, m, h = triage(), math_expert(), history_expert()
    return (
        HandoffBuilder(participants=[t, m, h])
        .with_start_agent(t)
        .add_handoff(t, [m, h])
        .add_handoff(m, [t])   # specialists can hand back to triage
        .add_handoff(h, [t])
        .with_autonomous_mode(
            agents=[t, m, h],
            turn_limits={"triage": 3, "math": 2, "history": 2},
        )
        .build()
    )

Four details that matter:

  • Agents are factories, not singletons. triage() builds a fresh Agent each time. Sharing a ChatClient across factories is fine (it’s stateless); sharing the Agent itself across concurrent runs is a surprise waiting to happen if any middleware is stateful.
  • name= becomes the executor id inside the workflow. That’s what shows up on executor_id in stream events and what you key turn_limits on. Keep them short and stable.
  • Instructions are load-bearing. Triage’s “simply acknowledge and stop — do not rewrite the answer” clause is what keeps it from taking over after the specialist replies. The specialists’ “in one short sentence” keeps them from rambling into a new handoff trigger. Drop these and you’ll watch the routing go feral.
  • with_autonomous_mode(...) is experimental. Without it the workflow pauses on request_info between agent turns, waiting for user input — fine for a chat UI, wrong for the scripted tests we run here. The turn_limits kwarg is the only thing standing between a demo and a runaway.

### Reading the event stream

workflow.run(...) streams events as each agent runs. For Handoff, the interesting shapes are output (carrying streaming AgentResponseUpdate deltas per agent) and handoff_sent (one event per successful handoff tool call):

async def ask(question: str) -> tuple[list[str], str]:
    workflow = build_workflow()
    current: str | None = None
    buffers: list[tuple[str, list[str]]] = []
    handoffs: list[str] = []

    async for event in workflow.run(question, stream=True):
        etype = getattr(event, "type", None)
        eid = getattr(event, "executor_id", "") if etype == "output" else None

        if etype == "output" and eid in {"triage", "math", "history"}:
            if current != eid:
                current = eid
                buffers.append((eid, []))
            update = getattr(event, "data", None)
            text = getattr(update, "text", None) if update is not None else None
            if text:
                buffers[-1][1].append(text)

        elif etype == "handoff_sent":
            data = getattr(event, "data", None)
            target = getattr(data, "target", None)
            if target:
                handoffs.append(target)

    turns = [(eid, "".join(parts).strip()) for eid, parts in buffers if any(parts)]
    participants = [eid for eid, _ in turns]
    final = turns[-1][1] if turns else ""
    return participants, final

Two things worth pinning down:

  • output events are streaming deltas, not complete messages. event.data is an AgentResponseUpdate — the .text field holds the fragment emitted in this chunk, not the whole assistant turn. Aggregating consecutive events from the same executor gives you each agent’s full message.
  • handoff_sent is the audit trail. Every time an agent calls a handoff_to_* tool successfully, one handoff_sent event fires with source/target. That’s the trace you ship to Aspire to prove which routing decisions actually happened.
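The aggregation rule is worth isolating. Here is a stdlib-only sketch of the same logic ask() applies, with aggregate_deltas as a hypothetical name:

```python
# Consecutive fragments from the same executor_id join into one message;
# an executor change starts a new turn. Same rule as ask() above.
def aggregate_deltas(deltas: list[tuple[str, str]]) -> list[tuple[str, str]]:
    turns: list[tuple[str, list[str]]] = []
    for executor_id, fragment in deltas:
        if not turns or turns[-1][0] != executor_id:
            turns.append((executor_id, []))
        turns[-1][1].append(fragment)
    return [(eid, "".join(parts)) for eid, parts in turns]

stream = [("triage", "Routing to "), ("triage", "math."), ("math", "1,"), ("math", "554")]
print(aggregate_deltas(stream))
# [('triage', 'Routing to math.'), ('math', '1,554')]
```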

Running it:

uv run python tutorials/14-handoff-orchestration/python/main.py "What is 37 * 42?"
# Q: What is 37 * 42?
# Routing: triage → math
# A: 37 multiplied by 42 is 1,554.

uv run python tutorials/14-handoff-orchestration/python/main.py "When did World War 2 end?"
# Q: When did World War 2 end?
# Routing: triage → history
# A: World War 2 ended in 1945.

Two real LLM calls per question: Triage decides, specialist answers. The Routing: line is built from the executor_id of each agent that produced output — same data you’d get from handoff_sent events if you were shipping to telemetry instead of printing.

Source: dotnet/Program.cs. Same three agents, same mesh, but .NET leans into the interactive pattern: the sample starts the workflow, streams one batch of turns, and exits — the same shape a chat UI would use between user messages.

using Microsoft.Agents.AI;
using Microsoft.Agents.AI.Workflows;
using Microsoft.Extensions.AI;
using OpenAI.Chat;

using ChatMessage = Microsoft.Extensions.AI.ChatMessage;

ChatClient chatClient = BuildChatClient(); // OpenAI or Azure OpenAI

AIAgent triage = chatClient.AsAIAgent(
    instructions: TriageInstructions,
    name: "triage_agent",
    description: "Routes questions to the appropriate specialist.");
AIAgent mathTutor = chatClient.AsAIAgent(
    instructions: MathInstructions,
    name: "math_tutor",
    description: "Specialist agent for math and arithmetic questions.");
AIAgent historyTutor = chatClient.AsAIAgent(
    instructions: HistoryInstructions,
    name: "history_tutor",
    description: "Specialist agent for historical questions, dates, and events.");

Workflow workflow = AgentWorkflowBuilder.CreateHandoffBuilderWith(triage)
    .WithHandoffs(triage, new[] { mathTutor, historyTutor })
    .WithHandoffs(new[] { mathTutor, historyTutor }, triage)
    .Build();

Four things worth flagging before the runner code:

  • description: matters. The .NET builder derives the synthesised handoff_to_<name> tool’s JSON schema description from each target agent’s Description. A missing or generic description produces a schema the source agent’s LLM can’t route against — and routing quality collapses. Treat it as a prompt, not metadata.
  • WithHandoffs takes either direction. WithHandoffs(triage, [math, history]) says “triage can call handoff_to_math / handoff_to_history”; WithHandoffs([math, history], triage) says “math and history can both call handoff_to_triage_agent”. Both overloads are used in this sample.
  • CreateHandoffBuilderWith(triage) pins the start agent. The first agent to run is whichever one you pass here; the mesh around it is built with subsequent WithHandoffs calls.
  • The analyzer warning. CreateHandoffBuilderWith is marked [Experimental("MAAIW001")] in 1.1. TreatWarningsAsErrors=true (inherited from Ch12/Ch13) promotes it to a compile error. The sample .csproj adds <NoWarn>$(NoWarn);MAAIW001</NoWarn> so the rest of the warning budget stays strict. Shape is stable; the attribute is a ship-warning, not a stability warning.
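The description-to-schema relationship is worth making concrete. A sketch of the assumed structure — the exact schema MAF emits may differ, and handoff_tool_schema is a hypothetical helper:

```python
# Assumed shape, not MAF internals: the synthesised tool's description is
# lifted from the *target* agent's description, which is why generic
# descriptions collapse routing quality.
def handoff_tool_schema(target_name: str, target_description: str) -> dict:
    return {
        "name": f"handoff_to_{target_name}",
        "description": f"Transfer control to {target_name}. {target_description}",
        "parameters": {"type": "object", "properties": {}},
    }

schema = handoff_tool_schema(
    "math_tutor", "Specialist agent for math and arithmetic questions."
)
print(schema["name"])  # handoff_to_math_tutor
```

The description string is the only signal the source agent's LLM gets about what each target specialises in, which is why the gotchas below say to write it as a prompt line.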

### The interactive run loop

.NET uses the same InProcessExecution.RunStreamingAsync + TurnToken pattern you saw in Ch13, but for Handoff the loop is meaningfully different: one RunStreamingAsync call drives one batch of turns — Triage runs, handoff fires, specialist runs, the workflow pauses on a WorkflowOutputEvent carrying the accumulated List<ChatMessage>. The caller then decides whether to feed another user message in for a follow-up batch.

var messages = new List<ChatMessage> { new(ChatRole.User, question) };
var routing = new List<string>();
string? lastExecutorId = null;
List<ChatMessage>? newMessages = null;

await using StreamingRun run = await InProcessExecution.RunStreamingAsync(workflow, messages);
await run.TrySendMessageAsync(new TurnToken(emitEvents: true));

await foreach (WorkflowEvent evt in run.WatchStreamAsync())
{
    switch (evt)
    {
        case AgentResponseUpdateEvent update:
            if (update.ExecutorId != lastExecutorId)
            {
                lastExecutorId = update.ExecutorId;
                routing.Add(update.ExecutorId ?? "agent");
                Console.WriteLine();
                Console.WriteLine($"[{update.ExecutorId}]");
            }
            Console.Write(update.Update.Text);
            break;

        case WorkflowOutputEvent output when output.Data is List<ChatMessage> list:
            newMessages = list;
            break;
    }
}

Three practical notes:

  • AgentResponseUpdateEvent is the streaming shape. One event per token, grouped by ExecutorId. Watch for the id transition to print the agent header exactly once per turn. (AgentResponseEvent exists and fires once per agent when the turn completes, if you prefer non-streaming.)
  • WorkflowOutputEvent.Data is a List<ChatMessage>, the conversation so far including the user turn you sent in. That’s the value you append future user messages to for the next batch.
  • TurnToken(emitEvents: true) is not optional. Without the token the workflow sits wired-but-idle and WatchStreamAsync() yields nothing. Same gotcha as Ch13.
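The batch contract in the second note can be sketched language-neutrally in Python, with dict-shaped messages as stand-ins for ChatMessage:

```python
# Stand-in for the .NET batch contract: each batch returns the full
# conversation so far; the caller appends the next user turn and reruns.
def next_batch_input(conversation: list[dict], user_followup: str) -> list[dict]:
    return conversation + [{"role": "user", "content": user_followup}]

batch1 = [
    {"role": "user", "content": "What is 37 * 42?"},
    {"role": "assistant", "content": "37 multiplied by 42 is 1,554."},
]
batch2 = next_batch_input(batch1, "And what is 37 * 43?")
print(len(batch2))  # 3: the prior turns plus the new user message
```

Bounding how many times you call this is the .NET equivalent of turn_limits: the outer loop, not the builder, owns the budget.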

Running all together:

cd tutorials/14-handoff-orchestration/dotnet
dotnet run -- "What is 37 * 42?"
# Q: What is 37 * 42?
#
# [triage_agent]
# This is a math question. I'll route it to the math tutor.
#
# [math_tutor]
# 37 multiplied by 42 is 1,554.
#
# Routing: triage_agent -> math_tutor
# Final : 37 multiplied by 42 is 1,554.

## Side-by-side — Python vs .NET

| Aspect | Python | .NET |
|---|---|---|
| Builder entry point | HandoffBuilder(participants=[...]).with_start_agent(t).build() | AgentWorkflowBuilder.CreateHandoffBuilderWith(t).Build() |
| Declare outgoing edges | .add_handoff(source, [targets]) | .WithHandoffs(source, new[] { targets }) |
| Declare incoming edges | Same .add_handoff(target, [source]) flipped | .WithHandoffs(new[] { sources }, target) overload |
| Default handoff tool name | handoff_to_<agent_name> synthesised per edge | handoff_to_<agent_name> synthesised per edge |
| Pause-for-user behaviour | Emits request_info event with HandoffAgentUserRequest payload | WorkflowOutputEvent returned to caller; caller decides whether to resume |
| Autonomous flag | .with_autonomous_mode(agents=..., turn_limits={...}, prompts={...}) | Not exposed as a builder flag; emulated via the outer caller’s run loop |
| Per-agent turn budget | turn_limits={name: int} on autonomous mode | Outer run loop bounds number of RunStreamingAsync calls |
| Observe handoffs | handoff_sent event with source/target on event.data | No dedicated event; infer from AgentResponseUpdateEvent.ExecutorId transitions |
| Per-token streaming | output events carry AgentResponseUpdate with .text fragments | AgentResponseUpdateEvent carries Update.Text fragments |
| Per-turn completion | executor_completed with list[AgentExecutorResponse] | AgentResponseEvent per agent |
| Final surface | WorkflowOutputEvent-analogue (type == "output") with conversation payload | WorkflowOutputEvent.Data as List<ChatMessage> |
| Stability marker | Autonomous mode is experimental; core builder is not | CreateHandoffBuilderWith / WithHandoffs marked [Experimental("MAAIW001")] |

Structurally the languages agree. The sharpest divergence is where the pause-for-user boundary lives: Python raises it as an explicit event inside the same run(...) call, so autonomous mode is a builder toggle that fabricates responses; .NET returns control to the caller at the same boundary, so “autonomous” is whatever outer loop the application builds. Neither is wrong; they’re different defaults for the same state machine.

## Gotchas

  • Agents with no outgoing edges are terminal. If you skip WithHandoffs(math, [...]) entirely, Math gets no handoff tool and must answer. That’s sometimes exactly what you want, but the Python builder will warn math has no handoff targets so the omission reads as deliberate.
  • description on each agent is the handoff prompt. The synthesised handoff_to_<name> tool’s JSON schema description comes from the target’s Description (.NET) / description= (Python). A generic “specialist agent” string routes badly. Write these as if they were a prompt line, because they are.
  • Turn limits are mandatory in autonomous mode. Without turn_limits={...}, the Python workflow will happily cycle triage -> math -> triage -> math until you hit a quota error. Pick limits that match your minimum viable path length plus one.
  • handoff_sent fires on successful handoff, not on attempt. If the LLM emits a malformed tool call, MAF surfaces that as a function-calling error, not a handoff event. Treat handoff_sent as “routing actually happened.”
  • output events in Python are deltas. event.data is an AgentResponseUpdate, not a string. Use .text for the fragment, aggregate consecutive events from the same executor_id to reconstruct each turn. Copying Ch10’s data-event pattern here gives you partial strings.
  • .NET AgentResponseUpdateEvent also streams deltas. Same aggregation discipline applies. Track ExecutorId transitions to cut between agents.
  • TurnToken(emitEvents: true) forgotten = silent hang (.NET). RunStreamingAsync returns a StreamingRun in a wired-but-not-started state. The TurnToken is what dispatches the first superstep.
  • Experimental markers stay experimental. HandoffBuilder.with_autonomous_mode in Python and CreateHandoffBuilderWith/WithHandoffs in .NET are public preview in 1.1. Shape is fine for tutorials and internal tools; pin your MAF version in production so a minor upgrade doesn’t rename a method.
  • Don’t share a single agent across workflows. Handoff agents are mutated at build time (the synthesised handoff tools are injected into their tool list). Build a fresh agent per workflow instance; reuse the ChatClient.
  • Specialist instructions want a “stop when done” clause. Without one the specialist will gratefully take every follow-up from the user that wanders into view, even if it’s no longer in their domain. The sample uses “in ONE short sentence” plus “Do not hand off back unless the question is clearly not about math” — both matter.
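One defensive pattern worth adding to your own event consumer: flag a ping-pong before the budget burns. A sketch built on the routing trace the samples already collect — detect_ping_pong is a hypothetical helper, since MAF itself bounds loops via turn_limits:

```python
# Hypothetical guard: True when the tail of a routing trace alternates
# between the same two agents max_repeats times in a row.
def detect_ping_pong(routing: list[str], max_repeats: int = 2) -> bool:
    window = routing[-2 * max_repeats:]
    if len(window) < 2 * max_repeats:
        return False
    a, b = window[0], window[1]
    return a != b and window == [a, b] * max_repeats

print(detect_ping_pong(["triage", "math", "triage", "math"]))  # True
print(detect_ping_pong(["triage", "math"]))                    # False
```

Feed it the executor-id sequence (or the handoff_sent targets) and alert before the run exhausts its cap.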

## Tests

Python ships 1 wiring test plus 3 real-LLM integration tests. Integration tests skip when no LLM credentials are in .env.

# Python (4 tests)
source agents/.venv/bin/activate
python -m pytest tutorials/14-handoff-orchestration/python/tests/ -v
# 4 passed (3 hit real Azure OpenAI)

The tests exercise:

  • Wiring — build_workflow() returns a non-null Workflow with the three participants and mesh edges.
  • Math routing — “What is 37 * 42?” lands in the Math specialist and the answer contains 1554 (comma-tolerant).
  • History routing — “When did World War 2 end?” lands in the History specialist and the answer contains 1945.
  • Divergence — math and history questions routed in the same test session produce distinct participant sets, proving the mesh is actually routing and not always picking the same agent.
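The “comma-tolerant” check from the math-routing test is a one-liner worth stealing. contains_number is a hypothetical name for the assertion shape described above:

```python
# Strip thousands separators and whitespace before searching for the
# expected number, so "1,554" and "1554" both pass.
def contains_number(answer: str, expected: int) -> bool:
    return str(expected) in answer.replace(",", "").replace(" ", "")

print(contains_number("37 multiplied by 42 is 1,554.", 1554))  # True
```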

.NET:

cd tutorials/14-handoff-orchestration/dotnet
dotnet build                             # type-check and compile
dotnet run -- "What is 37 * 42?"         # end-to-end with real LLM
dotnet run -- "When did World War 2 end?"

The .NET build pins Microsoft.Agents.AI.Workflows 1.1.0, Microsoft.Agents.AI 1.1.0, Microsoft.Agents.AI.OpenAI 1.1.0, and Azure.AI.OpenAI 2.1.0 — same as Ch12/Ch13. The .csproj adds <NoWarn>$(NoWarn);MAAIW001</NoWarn> to silence the experimental attribute; TreatWarningsAsErrors=true stays in force for everything else.

## How this shows up in the capstone

agents/python/orchestrator/agent.py ships a hand-rolled router today: one call_specialist_agent(agent_name, message) tool that the orchestrator’s LLM invokes, with the orchestrator deciding which specialist to name and then calling the A2A endpoint over HTTP. That’s effectively a single giant handoff tool where the LLM picks the target from a string enum — awkward to prompt and impossible to audit without custom logging.

Phase 7 plans/refactor/10-orchestrator-to-handoff.md replaces it with HandoffBuilder. The orchestrator becomes the start agent; each specialist (ProductDiscovery, OrderManagement, PricingPromotions, ReviewSentiment, InventoryFulfillment) is a participant with explicit edges: orchestrator -> every specialist, every specialist -> orchestrator. A2A over HTTP remains the wire transport — Handoff drives orchestration in-process; A2A moves messages between service boundaries when a specialist runs in a different container.

The .NET parity port in agents/dotnet/src/ECommerceAgents.Orchestrator/ follows the same trajectory via AgentWorkflowBuilder.CreateHandoffBuilderWith(orchestrator).WithHandoffs(...). The refactor plan names this chapter as the pattern reference and cites the mesh-plus-turn-limits guidance above for the loop-prevention strategy in production.

## Further reading
