Series note — Part of MAF v1: Python and .NET. Last of five orchestration chapters. The most autonomous pattern: you hand in a fuzzy goal and Magentic figures out which workers to engage, in what order, and when to stop.
Repo — Full runnable code: tutorials/16-magentic-orchestration. Clone,
cd tutorials/16-magentic-orchestration, follow the per-language sections.
Why this chapter#
Sequential (Ch12) knows the path. Concurrent (Ch13) runs them all. Handoff (Ch14) lets agents pick neighbours. Group Chat (Ch15) uses a manager to schedule speakers off a conversation. Magentic does strictly more work than any of those: the manager reasons over a facts ledger (what it knows, what it needs to find out, what it can guess) and a plan (bullet-point steps for the team), picks a worker, reads the response, re-scores progress, and either iterates or synthesises a final answer.
Reach for it when the shape of the task changes with what you learn. “Plan a product launch” might need the Researcher twice and the Marketer once, or the Marketer first and the Legal specialist only if Legal concerns surface. You don’t predetermine that. The manager does.
Same problem, two managers — Group Chat vs Magentic#
The cleanest way to feel the gap between Ch15’s Group Chat and Magentic is to point both at the same task and watch the routing diverge. Take a single prompt — “Review this draft launch announcement: <500 words about a new headphone>” — given to a panel of three workers: Researcher (verifies factual claims), Marketer (sharpens positioning and tone), Legal (flags claims that need disclaimers).
Group Chat (Ch15) with the round-robin manager:
# Ch15 — fixed schedule, fixed turn count
workflow = (
    GroupChatBuilder()
    .add_participants(researcher, marketer, legal)
    .with_manager(RoundRobinGroupChatManager(max_rounds=3))
    .build()
)
# Speaking order is deterministic: Researcher → Marketer → Legal,
# repeated until max_rounds is hit. Every worker gets the same airtime
# whether they have something to add or not. The Legal agent runs
# even on a draft with no compliance concerns; the Researcher runs
# again on round 2 even if its first pass surfaced everything.

Magentic on the same prompt:
# Ch16 — manager re-decides every round based on what just happened
workflow = (
    MagenticBuilder()
    .participants(researcher=researcher, marketer=marketer, legal=legal)
    .with_standard_manager(
        chat_client=manager_client,
        max_round_count=6,
        max_stall_count=2,
    )
    .build()
)
# The manager builds a facts ledger ("the draft mentions ANC at 30dB,
# 30-hour battery, $349 price"), drafts a plan ("verify ANC claim,
# verify battery claim, check pricing language for compliance, sharpen
# CTA"), then routes:
# Round 1 → Researcher: verify ANC and battery numbers
# Round 2 → Legal: the draft uses "world's best" — flag for revision
# Round 3 → Marketer: replace "world's best" + tighten the CTA
# Round 4 → progress ledger says is_request_satisfied=true → finalise
# The Researcher never runs a second time. Legal runs because Magentic
# noticed the superlative; on a different draft it might not run at all.

The Group Chat version is predictable, fair, and over-eager. The Magentic version is adaptive, lopsided, and stops when it’s done. That’s the trade-off in one paragraph: pick Group Chat when every voice should weigh in by policy (review boards, design critiques where representation matters); pick Magentic when efficiency and depth-on-demand matter more than fairness.
Two practical consequences:
- Token budget. Group Chat with three workers and three rounds = 9 worker calls. Magentic on the same task usually finishes in 3–5 rounds with one worker per round, but spends 3–5 manager planning calls on top. Net token spend is in the same ballpark; where Magentic wins is when only one worker actually had useful work to do.
- Reproducibility. Group Chat gives you the same speaking order every run. Magentic does not — the manager’s planning is non-deterministic (LLM call), so the routing varies. For tests, capture the conversation under fixed seeds or assert on the final answer rather than the route.
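One way to make that testable: assert on invariants that survive non-deterministic routing. The helper below is a sketch of that idea (names and thresholds are illustrative, not from the repo's test suite):

```python
# Sketch: test the outcome, not the route. All names here are illustrative.
def assert_magentic_result(speakers: list[str], final_answer: str,
                           known_workers: set[str]) -> None:
    """Assert only properties that hold across non-deterministic runs."""
    # The route varies run to run, so only check that every delegation
    # went to a real participant...
    assert all(s in known_workers for s in speakers), "unknown speaker in route"
    # ...and that the manager produced a substantive final answer.
    assert len(final_answer) > 50, "final answer too short to be real"

# A run that happened to visit researcher then marketer and finished:
assert_magentic_result(
    ["researcher", "marketer"],
    "Launch brief: " + "x" * 60,
    {"researcher", "marketer", "legal"},
)
```

Asserting exact speaking orders is the kind of test that passes today and flakes tomorrow.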
Heads-up on the .NET side. Magentic is Python-only in MAF v1.1. The official docs state plainly: “Magentic Orchestration is not yet supported in C#.” (learn.microsoft.com). Every other orchestration chapter (12–15) has matching C# runnable code. This one has a .NET stub that prints the status message. Treat the Python section as canonical.
Prerequisites#
- Completed Chapter 15 — Group Chat Orchestration. Magentic is best read as “Group Chat with a much stronger manager” — share the Ch15 mental model.
- `.env` at the repo root with `OPENAI_API_KEY` or the Azure OpenAI trio. Magentic always runs real LLM calls; there is no offline mode.
- Budget. A default run does 3–5 manager planning calls plus one worker call per round, capped at `max_round_count`. Expect 8–20 LLM calls per task on the defaults. The example here caps it at 6 rounds / 2 stalls.
What you’ll learn#
- What the manager’s facts ledger and plan actually contain (prompts verbatim from the MAF source).
- How the progress ledger — five JSON fields the manager fills in every round — drives next-speaker selection and completion detection.
- The difference between `max_round_count` (hard wall on total turns) and `max_stall_count` (soft wall on consecutive unproductive turns).
- What observable events you get: `magentic_orchestrator` with `PLAN_CREATED` / `PROGRESS_LEDGER_UPDATED` / `REPLANNED`, and `group_chat` with `GroupChatRequestSentEvent`.
- Why .NET code examples that claim to use `MagenticBuilder` are invented — and what to check when Microsoft ships C# support.
The concept#
Two kinds of agents, one loop#
Magentic splits participants into two roles:
- Workers — your specialists (Researcher, Marketer, Legal). Plain agents, identical shape to every other orchestration chapter.
- Manager — either a `StandardMagenticManager` (the default, a Magentic-One-style planner with fixed prompts) or a custom subclass of `MagenticManagerBase`. The manager owns a separate LLM. It does not sit in the worker rotation. It plans, delegates, and synthesises — it never contributes content directly except at the final-answer step.
Every round, the manager does one of four things:
- Plan — on the first turn, build a facts ledger and a bullet plan.
- Assess — every subsequent round, produce a progress ledger that judges satisfaction, loop detection, forward progress, next speaker, and a specific instruction for that speaker.
- Delegate — route the instruction to the named worker and wait for the response.
- Finalise — when the progress ledger flags `is_request_satisfied=true`, synthesise the final answer from the full conversation.
If progress stalls, a fifth path kicks in: reset and replan — clear the chat history, rewrite the facts ledger, rewrite the plan, start the inner loop over.
State machine#
The manager’s loop. `max_round_count` bounds total iterations of the inner loop. `max_stall_count` bounds how many consecutive rounds the progress ledger can say “no progress” or “stuck in a loop” before the manager resets the chat and replans.
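The loop described above can be paraphrased in a few lines of Python. This is an illustrative sketch, not the MAF source — `plan`, `assess`, `run_worker`, and `synthesise` stand in for the manager's LLM calls:

```python
# Paraphrase of the Magentic manager loop — illustrative only, not MAF code.
def magentic_loop(task, plan, assess, run_worker, synthesise,
                  max_round_count=6, max_stall_count=2):
    history = [task]
    facts, steps = plan(task)            # outer loop: facts ledger + plan
    stall_count = 0
    for _ in range(max_round_count):     # inner loop, hard-capped
        ledger = assess(history)         # progress ledger, every round
        if ledger["is_request_satisfied"]:
            return synthesise(history)   # finalise
        if ledger["is_in_loop"] or not ledger["is_progress_being_made"]:
            stall_count += 1
            if stall_count > max_stall_count:
                history = [task]         # reset-and-replan: wipe and restart
                facts, steps = plan(task)
                stall_count = 0
                continue
        else:
            stall_count = max(0, stall_count - 1)
        history.append(run_worker(ledger["next_speaker"],
                                  ledger["instruction_or_question"]))
    return synthesise(history)           # round budget exhausted: finalise anyway
```

The real implementation adds event emission, state checkpointing, and retry logic around the ledger call, but the control flow is this shape.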
What the facts ledger actually is#
The facts ledger is not a structured data record. It’s an LLM-generated assistant message produced by the manager’s first planning call, formatted with four fixed headings. The exact prompt MAF sends (from `agent_framework_orchestrations/_magentic.py`):
Before we begin addressing the request, please answer the following pre-survey to the best of your ability. Keep in mind that you are Ken Jennings-level with trivia, and Mensa-level with puzzles…
- Please list any specific facts or figures that are GIVEN in the request itself.
- Please list any facts that may need to be looked up, and WHERE SPECIFICALLY they might be found.
- Please list any facts that may need to be derived (e.g., via logical deduction, simulation, or computation).
- Please list any facts that are recalled from memory, hunches, well-reasoned guesses, etc.
Your answer should use headings:
- GIVEN OR VERIFIED FACTS
- FACTS TO LOOK UP
- FACTS TO DERIVE
- EDUCATED GUESSES
For the launch-brief task, the facts ledger the manager’s LLM produces looks roughly like:
1. GIVEN OR VERIFIED FACTS
- The product is an AI meal planner.
- A launch brief is required (short, persuasive summary).
2. FACTS TO LOOK UP
- Current market size and competitors for AI meal planners (sources: Gartner, Crunchbase).
- Relevant dietary and health regulations (FTC guidance on wellness claims).
3. FACTS TO DERIVE
- Target audience segmentation based on common AI meal-planner users.
4. EDUCATED GUESSES
- Key differentiators likely include personalisation and automatic grocery lists.
- Pricing model likely subscription ($5–15/month range).

This ledger is stored internally on the `StandardMagenticManager` as `self.task_ledger.facts` (a `Message`), then concatenated with the plan and injected into the chat history so every subsequent worker turn sees it. You don’t write the headings. You don’t parse them. The manager treats it as free-form text for future LLM context.
What the plan actually is#
The plan is a second LLM assistant message, again free-form text. The manager’s LLM gets a team roster (each worker’s name/description) and this prompt:
Based on the team composition, and known and unknown facts, please devise a short bullet-point plan for addressing the original request. Remember, there is no requirement to involve all team members.
For the launch-brief task, a typical plan is:
- Ask the Researcher for one concrete market insight for AI meal planners.
- Ask the Marketer for a tagline grounded in that insight.
- Ask Legal for one regulatory or IP concern about the positioning.
- Synthesise a 3-paragraph launch brief using the gathered pieces.

Two things to stare at:
- The plan is advisory, not executable. The manager does not run this as a script. On every subsequent round it re-reads the full chat history and re-decides who speaks next via the progress ledger. The plan is just prior context that nudges the LLM in a sensible direction.
- “no requirement to involve all team members” is load-bearing. Without that clause, the manager would feel obliged to call every worker once. With it, the manager will cut a worker out of the flow when the ledger says no regulatory check is needed.
Together, the facts ledger + plan form what the source calls a `_MagenticTaskLedger`. You can read it via `manager.task_ledger.facts.text` and `manager.task_ledger.plan.text` after a run.
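The access pattern looks like the snippet below. The attribute names come from the source as quoted above; the `Fake*` classes are stand-ins so the snippet runs without a live workflow — in real code you would read `manager.task_ledger` after `workflow.run(...)` completes:

```python
# Hypothetical post-run inspection. Attribute names mirror the MAF source;
# the Fake* classes are stand-ins so this runs without a live workflow.
from dataclasses import dataclass

@dataclass
class FakeMessage:
    text: str

@dataclass
class FakeTaskLedger:
    facts: FakeMessage
    plan: FakeMessage

class FakeManager:
    def __init__(self):
        self.task_ledger = FakeTaskLedger(
            facts=FakeMessage("1. GIVEN OR VERIFIED FACTS\n- The product is..."),
            plan=FakeMessage("- Ask the Researcher for one market insight."),
        )

manager = FakeManager()
# Real code: manager.task_ledger.facts.text after the run finishes.
print(manager.task_ledger.facts.text.splitlines()[0])
```

Log these for humans; don't build logic on top of them — they're free-form text.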
The progress ledger — the workhorse#
The facts ledger and plan are produced once per outer-loop entry. The real per-round decisions come from the progress ledger, a JSON object the manager requests from its LLM on every inner-loop iteration. The request prompt asks five questions and enforces a strict JSON schema:
{
"is_request_satisfied": { "reason": "...", "answer": true },
"is_in_loop": { "reason": "...", "answer": false },
"is_progress_being_made": { "reason": "...", "answer": true },
"next_speaker": { "reason": "...", "answer": "researcher" },
"instruction_or_question": { "reason": "...", "answer": "Give one concrete insight about..." }
}

That object drives three decisions:
- Termination. `is_request_satisfied.answer == true` → call `prepare_final_answer`, exit.
- Stall bookkeeping. If `is_progress_being_made.answer == false` or `is_in_loop.answer == true`, increment `stall_count`. Otherwise decrement it (but not below zero). When `stall_count > max_stall_count`, trigger a full reset and replan.
- Delegation. Otherwise, send `instruction_or_question.answer` to the worker named in `next_speaker.answer`.
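Those three decisions reduce to a pure function over the ledger JSON. A sketch — the real bookkeeping lives inside `StandardMagenticManager`, this just makes the branching concrete:

```python
# Illustrative only: map a progress-ledger dict (shape shown above) to the
# manager's next action, threading the stall counter through.
def decide(ledger: dict, stall_count: int, max_stall_count: int) -> tuple[str, int]:
    if ledger["is_request_satisfied"]["answer"]:
        return "finalise", stall_count
    stalled = (ledger["is_in_loop"]["answer"]
               or not ledger["is_progress_being_made"]["answer"])
    if stalled:
        stall_count += 1
        if stall_count > max_stall_count:
            return "reset_and_replan", 0
    else:
        stall_count = max(0, stall_count - 1)  # productive round: decay
    # Stalled-but-under-budget rounds still delegate.
    return f"delegate:{ledger['next_speaker']['answer']}", stall_count

ledger = {"is_request_satisfied": {"answer": False},
          "is_in_loop": {"answer": False},
          "is_progress_being_made": {"answer": True},
          "next_speaker": {"answer": "researcher"},
          "instruction_or_question": {"answer": "Give one concrete insight."}}
print(decide(ledger, 0, 2))
```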
You won’t usually print this in production code, but for this chapter it’s the single most useful thing to observe. Two event streams surface it: the raw `magentic_orchestrator` events carry a `MagenticProgressLedger` payload on `PROGRESS_LEDGER_UPDATED`, and the JSON lives verbatim in `manager._progress_ledger` between turns if you need to inspect it in a debugger.
max_round_count vs max_stall_count#
These are different budgets, often confused, and they trigger different cleanups when tripped.
| Budget | What it counts | What happens when tripped |
|---|---|---|
| `max_round_count` | Total inner-loop iterations across the whole run. Every worker dispatch increments `round_count`. | Loop exits; manager writes a final answer from whatever it has. Conversation is not reset. No replan. |
| `max_stall_count` | Consecutive rounds the progress ledger reported “no progress” or “in a loop”. Resets to zero on any productive round. | Triggers `_reset_and_replan`: chat history wiped, participant states reset, new facts ledger + new plan generated, outer loop restarts. |
Defaults in MAF v1.1: `max_stall_count=3`, `max_round_count=None` (unlimited), `max_reset_count=None` (unlimited). In the sample here we set both explicitly — `max_round_count=6`, `max_stall_count=2` — because the defaults let the manager run expensively on an open-ended prompt.
If `max_reset_count` is set and the manager keeps stalling, it eventually gives up and moves to the final-answer step regardless of `is_request_satisfied`.
Jargon to nail#
- Magentic / `StandardMagenticManager` — the pattern and its default manager implementation, modelled on Microsoft’s Magentic-One research system. “Magentic” is the orchestration pattern; `StandardMagenticManager` is the shipping default.
- Facts ledger — an LLM-generated assistant message with four headings (`GIVEN OR VERIFIED FACTS`, `FACTS TO LOOK UP`, `FACTS TO DERIVE`, `EDUCATED GUESSES`). Stored on the manager as `task_ledger.facts`. Free-form text; not a structured record.
- Plan — a second LLM-generated assistant message, a bullet-point outline of next steps. Stored on the manager as `task_ledger.plan`. Re-evaluated but not re-generated every round.
- Progress ledger (`MagenticProgressLedger`) — the five-field JSON the manager produces every inner-loop iteration. Drives termination, stall bookkeeping, next-speaker selection.
- Delegate (worker) — a participant the manager can send an instruction to. Named by `agent.name`; described to the manager via `agent.description`.
- `max_round_count` — hard cap on inner-loop iterations. Defaults to unlimited; set it aggressively for cost control.
- `max_stall_count` — soft cap on consecutive stalled rounds before a reset-and-replan. Default 3.
- `max_reset_count` — cap on total resets. Default unlimited. When set and tripped, the manager gives up and finalises.
- `GroupChatRequestSentEvent` — the event fired each time the manager dispatches to a worker. Its `participant_name` is the selected worker. Magentic shares this event type with Group Chat (Ch15) — both use the same group-chat base.
- `MagenticOrchestratorEvent` / event types — Magentic-specific events: `PLAN_CREATED` (first plan), `REPLANNED` (after a stall reset), `PROGRESS_LEDGER_UPDATED` (every inner-loop round).
- Stall detection — the mechanism that decides when to reset. Increments on any round where the progress-ledger LLM says progress stopped or a loop started; decrements on productive rounds. When it exceeds `max_stall_count`, resets the chat and regenerates the ledger + plan.
- Worker boundaries — the scope of what each worker agent should produce. Tight, one-output-per-worker instructions lead to crisper progress-ledger decisions; woolly workers cause looping.
Code walkthrough#
Source: python/main.py.
Participants and manager#
from agent_framework import Agent
from agent_framework.orchestrations import MagenticBuilder
from agent_framework_orchestrations._magentic import StandardMagenticManager
def researcher() -> Agent:
    return Agent(
        _default_client(),
        instructions="You are a Market Researcher. Respond with one concrete market insight.",
        name="researcher",
    )

def marketer() -> Agent:
    return Agent(
        _default_client(),
        instructions="You are a Marketer. Respond with one tagline or positioning sentence.",
        name="marketer",
    )

def legal() -> Agent:
    return Agent(
        _default_client(),
        instructions="You are a Legal advisor. Respond with one regulatory or IP concern.",
        name="legal",
    )

def manager_agent() -> Agent:
    return Agent(
        _default_client(),
        instructions=(
            "You are a program manager coordinating a small team. "
            "Decompose the user's task into concrete subtasks and route each to the "
            "right specialist. Keep your reasoning tight."
        ),
        name="magentic-manager",
    )

Four agents, three of them workers. Two rules for worker instructions:
- One output each. “One concrete insight”, “one tagline”, “one concern”. Narrow scope makes the manager’s progress-ledger decisions reliable — the LLM judges “does this worker’s response count as done?” far more accurately when the worker has a single deliverable.
- `name` matches what the manager sees. The progress-ledger LLM is asked `"next_speaker": select from: researcher, marketer, legal`. If you rename `researcher` to `ResearchAgent_v2`, the LLM’s string output has to match exactly — the orchestrator logs a warning and selects the first participant as a fallback when it can’t match.
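That silent fallback is easy to miss in logs, so a cheap build-time guard is worth having. The helper below is our own addition, not part of MAF — it rejects names that are likely to trip the exact-match selection:

```python
# Optional build-time guard (not part of MAF): fail fast on worker names
# that are unlikely to survive the progress-ledger LLM's exact matching.
import re

def check_worker_names(names: list[str]) -> None:
    seen: set[str] = set()
    for name in names:
        # Short, lowercase, stable identifiers match most reliably.
        if not re.fullmatch(r"[a-z][a-z0-9_]*", name):
            raise ValueError(f"risky worker name: {name!r} — "
                             "keep it short, lowercase, and stable")
        if name in seen:
            raise ValueError(f"duplicate worker name: {name!r}")
        seen.add(name)

check_worker_names(["researcher", "marketer", "legal"])  # passes silently
```

Run it once in `build_workflow()` before handing participants to the builder.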
The manager is constructed the same way you’d construct any agent, with one extra step: wrap it in `StandardMagenticManager` and hand the manager to the builder.
def build_workflow():
    manager = StandardMagenticManager(
        agent=manager_agent(),
        max_round_count=6,
        max_stall_count=2,
    )
    return MagenticBuilder(
        participants=[researcher(), marketer(), legal()],
        manager=manager,
    ).build()

API shortcut. You can skip the explicit `StandardMagenticManager(...)` and pass `manager_agent=...` straight to `MagenticBuilder`:

MagenticBuilder(
    participants=[researcher(), marketer(), legal()],
    manager_agent=manager_agent(),
    max_round_count=6,
    max_stall_count=2,
).build()

Both forms are supported. Use the explicit manager when you want to override prompts or subclass `MagenticManagerBase`; use the shortcut otherwise. The sample uses the explicit form for teaching clarity.
Observing the decisions#
The interesting events for this chapter are group_chat (delegations) and magentic_orchestrator (planning/ledger updates). The sample runner collects delegations and the final answer:
async def plan(task: str) -> tuple[list[str], str]:
    workflow = build_workflow()
    speakers: list[str] = []
    final_messages: list[str] = []
    async for event in workflow.run(task, stream=True):
        etype = getattr(event, "type", None)
        if etype == "group_chat":
            data = getattr(event, "data", None)
            if data and type(data).__name__ == "GroupChatRequestSentEvent":
                pname = getattr(data, "participant_name", None)
                if pname:
                    speakers.append(pname)
        elif etype == "output":
            payload = getattr(event, "data", None)
            if isinstance(payload, list):
                for item in payload:
                    text = getattr(item, "text", None)
                    if text:
                        final_messages.append(text)
    return speakers, "\n\n".join(final_messages).strip()

Three event shapes to know about:
- `event.type == "group_chat"` with `GroupChatRequestSentEvent` — manager dispatched to a worker. Pull `participant_name` to log the route.
- `event.type == "magentic_orchestrator"` with `MagenticOrchestratorEvent` — the manager changed state. Check `data.event_type` against `PLAN_CREATED`, `PROGRESS_LEDGER_UPDATED`, or `REPLANNED`. On `PROGRESS_LEDGER_UPDATED`, `data.content` is a `MagenticProgressLedger` you can `to_dict()` for inspection.
- `event.type == "output"` — terminal event carrying the final assistant message list.
Example trace — the launch-brief task#
Running the canonical task:
uv run python tutorials/16-magentic-orchestration/python/main.py \
    "plan a short launch brief for an AI meal planner"

With debug logging on the orchestrator, a representative trace looks like this:
[manager] outer loop: building facts ledger (LLM call 1)
[manager] outer loop: drafting plan (LLM call 2)
[event] magentic_orchestrator PLAN_CREATED
[manager] inner loop round 1: creating progress ledger (LLM call 3)
[event] magentic_orchestrator PROGRESS_LEDGER_UPDATED
is_request_satisfied=false, in_loop=false, progress=true
next_speaker=researcher
instruction="Give one concrete market insight about AI meal planners."
[event] group_chat GroupChatRequestSentEvent participant_name=researcher
[worker] researcher responds: "73% of US households cook at home on weekdays..."
[manager] inner loop round 2: creating progress ledger (LLM call 4)
[event] magentic_orchestrator PROGRESS_LEDGER_UPDATED
next_speaker=marketer
instruction="Given that weekday-home-cooking insight, produce one tagline."
[event] group_chat GroupChatRequestSentEvent participant_name=marketer
[worker] marketer responds: "Dinner, decided. An AI planner..."
[manager] inner loop round 3: creating progress ledger (LLM call 5)
[event] magentic_orchestrator PROGRESS_LEDGER_UPDATED
is_request_satisfied=true
[manager] preparing final answer (LLM call 6)
[event] output list[Message] <-- final answer

Six LLM calls — two for the task ledger, three progress-ledger assessments, one final-answer synthesis. Three rounds in the inner loop, two actual worker dispatches. The manager judged that Legal wasn’t needed for a short positioning brief and didn’t consult it; if the task had been “plan a regulatory-compliant launch brief”, the progress ledger would have selected `legal` at some round before satisfying.
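The manager-side call count generalises to a simple formula, useful for budgeting. This counts only the manager's calls (as the trace numbers them); worker responses are additional LLM calls on top:

```python
# Back-of-envelope manager call count for a run, matching the trace:
# (facts ledger + plan) once per outer-loop entry, one progress-ledger
# assessment per inner round, one final-answer synthesis.
def manager_calls(inner_rounds: int, resets: int = 0) -> int:
    planning = 2 * (1 + resets)   # redone from scratch after every reset
    assessments = inner_rounds    # one progress ledger per inner round
    finalise = 1
    return planning + assessments + finalise

# The trace above: 3 inner rounds, no resets -> 6 manager calls,
# plus 2 worker responses -> 8 LLM calls total.
print(manager_calls(3), manager_calls(3) + 2)
```

This is why the Prerequisites section budgets 8–20 calls per task: add a reset or a few extra rounds and the count climbs quickly.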
Final output (truncated):
Here's a concise launch brief for your AI meal planner:
Tired of deciding what to cook every weeknight? Our AI meal planner takes the
decision out of dinner. Plan a week of meals in under a minute, generate a
shopping list that adapts to your dietary needs, and spend the time you saved
actually eating.
Key Benefits
- Personalised meal suggestions for your preferences, health goals, and pantry
- Automatic, editable shopping lists tied to your chosen store
- Respects dietary restrictions and allergies
...

The trace and the delegation order will vary across runs — that’s the point of Magentic. Determinism isn’t the goal; adaptive decomposition is.
What the same task would do with stalling#
Stalling triggers when workers repeat themselves or the progress ledger judges recent turns don’t move the task forward. A contrived trace where the Researcher is asked the same question twice:
round 1 next=researcher response: "AI meal planners are a growing market."
round 2 next=researcher response: "AI meal planners are a growing market." -- identical
progress_ledger.is_in_loop = true stall_count = 1
round 3 next=researcher response: "AI meal planners grow 15% yearly."
progress_ledger.is_progress_being_made = false stall_count = 2
round 4 next=researcher response: "It's a growing space."
progress_ledger.is_progress_being_made = false stall_count = 3 -- > max_stall_count=2
[manager] RESET: clearing chat_history, reset_count=1
[manager] replan: updating facts ledger + new plan with "overcome prior challenges"
[event] magentic_orchestrator REPLANNED
[manager] inner loop restarted

Note: `max_stall_count=2` means stall cleanup triggers when `stall_count` exceeds 2 — so the third consecutive unproductive round causes the reset, not the second. Source check: `if self._magentic_context.stall_count > self._manager.max_stall_count`. Set `max_stall_count=1` if you want aggressive cleanup.
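The “exceeds” semantics are easy to get off by one, so here's the rule replayed in isolation — a tiny simulation where every round is unproductive, showing which round trips the reset:

```python
# Replay the "exceeds" rule from the source check quoted above:
# the reset fires when stall_count passes (not reaches) max_stall_count.
def rounds_until_reset(max_stall_count: int) -> int:
    stall_count = 0
    for round_no in range(1, 100):
        stall_count += 1              # every round unproductive in this sim
        if stall_count > max_stall_count:
            return round_no           # reset triggers on this round
    raise RuntimeError("never reset")

print(rounds_until_reset(2))  # prints 3: the third bad round trips it
print(rounds_until_reset(1))  # prints 2: aggressive cleanup
```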
Microsoft’s docs are explicit:
“Magentic Orchestration is not yet supported in C#.” (learn.microsoft.com)
Confirmed against `Microsoft.Agents.AI.Workflows` 1.1.0 — no `MagenticBuilder`, no `StandardMagenticManager`, no `MagenticOrchestratorEvent`. The chapter’s `dotnet/` project is therefore an informational stub, not a runnable sample:
public static class Program
{
public static void Main()
{
Console.WriteLine("Chapter 16 — Magentic Orchestration");
Console.WriteLine();
Console.WriteLine("Magentic orchestration is Python-only in Microsoft Agent Framework v1.1.");
// ... prints a pointer to the Python sample ...
}
}

cd tutorials/16-magentic-orchestration/dotnet
dotnet build # succeeds — no MAF workflow package needed for the stub
dotnet run      # prints the status message

When C# support lands, the shape will almost certainly mirror the other orchestration builders in the repo:
// Hypothetical — NOT currently valid. Do not copy into production.
var workflow = AgentWorkflowBuilder
    .BuildMagentic(new[] { researcher, marketer, legal }, managerAgent,
        options: new MagenticOptions { MaxRoundCount = 6, MaxStallCount = 2 });

Until then, any .NET Magentic sample you find online that references `MagenticBuilder` / `StandardMagenticManager` is either targeting a preview branch or invented. Check it against the shipping assembly before trusting it.
Side-by-side differences#
| Aspect | Python | .NET |
|---|---|---|
| Orchestration supported? | Yes — `agent_framework.orchestrations.MagenticBuilder` | No — confirmed not in `Microsoft.Agents.AI.Workflows` 1.1.0 |
| Builder | `MagenticBuilder(participants=..., manager_agent=..., max_round_count=..., max_stall_count=...).build()` | Not available |
| Standard manager | `StandardMagenticManager(agent=..., max_round_count=..., max_stall_count=..., max_reset_count=...)` | Not available |
| Custom manager | Subclass `MagenticManagerBase`; override `plan`, `replan`, `create_progress_ledger`, `prepare_final_answer` | Not available |
| Delegation event | `group_chat` event with `GroupChatRequestSentEvent` payload | Not available |
| Orchestrator event | `magentic_orchestrator` with `MagenticOrchestratorEvent` (`PLAN_CREATED` / `REPLANNED` / `PROGRESS_LEDGER_UPDATED`) | Not available |
| Human-in-the-loop plan review | `enable_plan_review=True` on the builder; `MagenticPlanReviewRequest` / `MagenticPlanReviewResponse` | Not available |
Gotchas#
- Token cost is real. Defaults (`max_stall_count=3`, no round cap) let a single run cost 20+ LLM calls on an open-ended task. The manager’s ledger call runs every round; the worker call runs every round; neither is cached. Set `max_round_count=6` and `max_stall_count=2` for interactive demos.
- Facts ledger and plan are text, not structured data. Don’t parse them. Read them for context and log them for humans; don’t try to extract a JSON plan from the plan message.
- Progress ledger JSON can fail parsing. `StandardMagenticManager` retries up to `progress_ledger_retry_count` times with backoff. If parsing keeps failing, the manager triggers a reset. Expect occasional retry warnings in logs on noisy LLM providers.
- `max_stall_count` is an “exceeds” threshold. `stall_count > max_stall_count` triggers reset. Setting `max_stall_count=2` means the third consecutive unproductive round triggers — not the second. Use `max_stall_count=1` for very aggressive cleanup.
- Worker `name` is matched exactly. The progress-ledger LLM emits `next_speaker` as a plain string. If it doesn’t match a participant name, MAF logs a warning and falls back to the first participant. Keep names short and stable (`researcher`, not `resesarcher-v3-experimental`).
- Worker `description` affects manager quality. The manager sees each worker as `name: description`. A vague description (“helps with things”) leads the LLM to pick the wrong speaker. Write crisp role sentences.
- Manager quality dominates final output quality. A weak manager agent with vague instructions (“help coordinate”) produces noisy plans and worse progress ledgers than workers can recover from. Invest prompt effort here first.
- No native .NET support. Rewriting your orchestrator around Magentic today commits you to Python for that service. Factor that into architecture decisions — the rest of the capstone keeps Python and .NET in parity, but this chapter doesn’t.
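To make the description gotcha concrete: the manager sees roughly a roster of `name: description` pairs. The render helper below is ours, not MAF's — it just shows the difference between a roster the routing LLM can work with and one it can't:

```python
# Illustrative: roughly what the manager sees as its team roster.
# render_roster is our helper, not a MAF API.
def render_roster(workers: dict[str, str]) -> str:
    return "\n".join(f"{name}: {desc}" for name, desc in workers.items())

crisp = render_roster({
    "researcher": "Verifies factual claims; one concrete insight per turn.",
    "marketer": "Produces one tagline or positioning sentence per turn.",
    "legal": "Flags one regulatory or IP concern per turn.",
})
vague = render_roster({"helper": "helps with things"})  # routing will suffer
print(crisp)
```

With the crisp roster, `next_speaker` selection has obvious boundaries to reason over; with the vague one, every task looks like a job for `helper`.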
Tests#
Python only — there is no runnable .NET test surface for this chapter.
# 1 wiring + 2 real-LLM integration tests
source agents/.venv/bin/activate
python -m pytest tutorials/16-magentic-orchestration/python/tests/ -v
# 3 passed (runtime ~60s — multiple real LLM calls per test)

The integration tests assert:
- The workflow builds without raising.
- On a real task, the manager produces a substantive final answer (> 50 chars) — proving the loop exited via `is_request_satisfied` rather than a round-count timeout with an empty conversation.
- On a broader task mentioning market context, a tagline, and a regulatory note, the delegation log either includes at least one known worker name or is empty (the manager is allowed to decide no delegation was needed). We do not assert all three workers were called — the manager genuinely has discretion here, and a flaky “must call all three” test is the first thing a naive Magentic suite would generate.
Tests run real OpenAI/Azure calls; they skip if credentials aren’t in `.env`.
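One way to express that skip condition (the Azure variable name here is illustrative — match it to whatever your `.env` trio actually uses):

```python
# Sketch of the tests' skip condition. AZURE_OPENAI_ENDPOINT is an
# illustrative stand-in for whichever Azure variable your .env defines.
import os

def llm_credentials_present(env=os.environ) -> bool:
    return bool(env.get("OPENAI_API_KEY") or env.get("AZURE_OPENAI_ENDPOINT"))

# In pytest, wire it up as a module-level marker:
#   requires_llm = pytest.mark.skipif(
#       not llm_credentials_present(), reason="no LLM credentials in .env")
print(llm_credentials_present({}))
```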
When Magentic is the right shape#
Reach for Magentic when:
- The decomposition is task-dependent. Two user prompts that superficially look similar (“plan a product launch” vs “plan a regulated product launch”) should run very different worker sequences. Handoff or Group Chat can’t express that without bespoke routing logic.
- You want the manager to judge completion. The progress ledger’s `is_request_satisfied` gives you a native “are we done?” signal grounded in the actual conversation — not a round counter you set in advance.
Prefer Group Chat (Ch15) when the manager only needs to pick a speaker from a fixed set each round — no replanning, no ledger, no stall detection. Prefer Handoff (Ch14) when agents choose their neighbour rather than a manager. Prefer Sequential (Ch12) when the flow is known. Prefer Concurrent (Ch13) when workers are independent and you aggregate.
How this shows up in the capstone#
Not in the Phase 7 refactor. Candidate for a follow-up “research assistant” or “concierge” specialist in the e-commerce app — cases where the orchestrator genuinely can’t predict whether it needs to consult Product Discovery once, or iterate with Pricing-Promotions until a deal is found. Flagged as future work because: (a) the cost model has to be justified per user-facing interaction, and (b) .NET parity is blocked on Microsoft shipping C# Magentic support.
Further reading#
- Canonical README: tutorials/16-magentic-orchestration
- MAF docs — Magentic orchestration
- MAF docs — Group Chat orchestration (shares architecture with Magentic)
- Magentic-One — AutoGen’s original paper / reference implementation
What’s next#
- Previous: Chapter 15 — Group Chat Orchestration
- Next chapter: Chapter 17 — Human-in-the-Loop

