
MAF v1 — Human-in-the-Loop (Python + .NET)

Nitin Kumar Singh
I build enterprise AI solutions and cloud-native systems. I write about architecture patterns, AI agents, Azure, and modern development practices — with full source code.
MAF v1: Python and .NET - This article is part of a series.
Part 17: This Article

Series note — Part of MAF v1: Python and .NET. First of four advanced chapters after the five orchestration patterns (Sequential, Concurrent, Handoff, Group Chat, Magentic). Next: State and Checkpoints.

Repo — Full runnable code for this chapter is at https://github.com/nitin27may/e-commerce-agents/tree/main/tutorials/17-human-in-the-loop. Clone the repo, cd tutorials/17-human-in-the-loop, and follow the per-language instructions below.

Why this chapter
#

Some decisions shouldn’t be autonomous. Approving a refund over $500. Confirming a return shipping label. Selecting which of three draft emails to send. A customer-support bot that cancels a subscription the moment an LLM feels 70% confident about the intent is a liability, not a product.

MAF’s answer is human-in-the-loop (HITL): an executor pauses the workflow, emits a request event carrying a unique request_id, and the caller is responsible for collecting a human answer and calling back with {request_id: answer}. The framework routes the response to the right handler and resumes the workflow from where it stopped. No checkpointing infrastructure, no separate durable-functions runtime — just two calls to workflow.run() and one correlation id.

The rest of the chapter is about that correlation id: where it comes from, how you pair it with the answer, and what goes wrong when you don’t.

Jargon defined inline below: request_info, response_type, @response_handler (Python) / [ResponseHandler] (.NET), request_id, RequestPort (.NET), WorkflowContext[T, U] (Python-only).

Prerequisites
#

  • Completed Chapter 16 — Magentic Orchestration. Handoff (Ch14) also pauses for the user, but that pause is synthesised by the HandoffBuilder — Ch17 is the underlying primitive.
  • uv for Python; .NET 10 SDK for the .NET sample.
  • No LLM required. HITL is framework plumbing. The guessing-game example runs offline, which is why this chapter’s tests are deterministic.

The concept: the caller’s two-call dance
#

The one idea to pin down is the caller’s perspective. From the executor’s side, ctx.request_info(...) looks like an await that suspends and later resumes with a value. From the caller’s side it is two distinct calls to workflow.run(...), separated by whatever time the human takes to answer, with a request_id threading them together.

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor': '#2563eb','primaryTextColor': '#ffffff','primaryBorderColor': '#1e40af', 'lineColor': '#64748b','secondaryColor': '#f59e0b','tertiaryColor': '#10b981', 'background': 'transparent'}}}%%
sequenceDiagram
    participant Caller as Caller (app or UI)
    participant WF as Workflow
    participant Exec as Executor
    Note over Caller,Exec: Call 1 - workflow runs until it needs human input
    Caller->>WF: workflow.run(prompt, stream=True)
    WF->>Exec: @handler start(prompt)
    Exec->>WF: ctx.request_info(GuessRequest, response_type=int)
    WF-->>Caller: request_info event (request_id=abc123)
    Note over Caller: Stream ends, no output event. Caller remembers request_id.
    Note over Caller,Exec: Out-of-band - caller collects the human answer
    Caller->>Caller: guess = input("Pick a number")
    Note over Caller,Exec: Call 2 - resume with {request_id - answer}
    Caller->>WF: workflow.run(responses={abc123 - guess})
    WF->>Exec: @response_handler check(GuessRequest, guess)
    Exec->>WF: ctx.yield_output("too low")
    WF-->>Caller: output event

Two workflow.run() calls, one request_id. The first call’s stream ends on request_info; the second call starts with responses={request_id: answer} and ends on output. Everything else — databases, UIs, queues, checkpoints — slots into the “out-of-band” middle.

The important properties of this dance:

  • Streams always terminate. The first call’s async iterator ends even though the workflow is not complete. That is the caller’s signal to stop waiting and go fetch the human answer.
  • Nothing else is running. Between call 1 and call 2, the workflow state sits in memory (or in a checkpoint, if you persisted it). No executor is executing.
  • The request_id is the contract. It is the only thing the caller needs to round-trip. Everything else in the request payload is advisory — the framework uses request_id alone to route the response back to the right @response_handler.
  • Responses are keyed, not ordered. You can have multiple pending requests in one workflow; responses={id_a: x, id_b: y} resumes them both.
  • The out-of-band middle is yours. The framework doesn’t care what you do between call 1 and call 2. Push the pause to a queue, store the request_id in Redis, render a UI card, email a human, call a Slack webhook — all valid. The only constraint is that whatever resumes the workflow holds the same workflow object (or a checkpoint of it) and the correct request_id.
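A minimal shape for that out-of-band middle, with hypothetical names — the framework mandates none of this, only that the same `request_id` string comes back in the `responses` mapping:

```python
# Hypothetical pending-request store for the out-of-band middle. Nothing
# here is MAF API; the only contract is that the same request_id string
# comes back keyed in the responses mapping.
pending: dict[str, dict] = {}  # request_id -> whatever UI state needs it

def on_pause(request_id: str, payload: object, conversation_id: str) -> None:
    """Called when the first stream ends on a request_info event."""
    pending[request_id] = {"conversation": conversation_id, "payload": payload}

def on_human_answer(request_id: str, answer: object) -> dict:
    """Called when the UI posts back; builds the responses= mapping."""
    pending.pop(request_id)        # single-use: consume the id on resume
    return {request_id: answer}    # shape workflow.run(responses=...) expects

on_pause("abc123", {"prompt": "Pick a number 1-10:"}, "conv-42")
print(on_human_answer("abc123", 5))  # {'abc123': 5}
```

Swap the dict for Redis, a database row, or a queue message and the shape is unchanged.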

Jargon recap
#

  • request_info (Python) / RequestPort.Create<TReq, TResp>(id) (.NET) — the pause primitive. Python calls ctx.request_info(request_data=..., response_type=...) inline from any @handler. .NET builds a RequestPort as a first-class executor node and wires it into the workflow graph via WorkflowBuilder. Both surface the pause to the caller as a framework event.
  • response_type (Python) — the type the executor expects back from the human. Part of the signature ctx.request_info uses to match responses to handlers. In .NET the type is the second generic on RequestPort.Create<TRequest, TResponse>(id).
  • request_id — a framework-assigned unique id for this specific pause. The caller sees it on the pause event; the caller must pass it back on the resume call. Opaque to user code; never reuse.
  • @response_handler (Python) / [ResponseHandler] (.NET, source-generated) — the executor method that runs when a response arrives. Signature: (self, original_request, response, ctx) in Python. In .NET the same role is implicit — the RequestPort forwards the int response back to the executor’s HandleAsync method via a workflow edge.
  • WorkflowContext[T, U] (Python only) — the typed context passed to executors. T is the downstream message type (None here because the executor doesn’t forward to another node), U is the output type (str for the final verdict). The type parameters are load-bearing for MAF’s validator; see the gotchas.

Code walkthrough
#

Source: python/main.py. One executor, two methods, no LLM.

from dataclasses import dataclass

from agent_framework._workflows._executor import Executor, handler
from agent_framework._workflows._request_info_mixin import response_handler
from agent_framework._workflows._workflow_builder import WorkflowBuilder
from agent_framework._workflows._workflow_context import WorkflowContext


@dataclass(frozen=True)
class GuessRequest:
    """Payload the workflow sends out to the human."""
    prompt: str


class GuessingGame(Executor):
    def __init__(self, secret: int) -> None:
        super().__init__(id="guessing-game")
        self.secret = secret

    @handler
    async def start(self, prompt: str, ctx: WorkflowContext[None, str]) -> None:
        # Pause the workflow. The caller will see a request_info event
        # carrying a fresh request_id and the GuessRequest payload.
        await ctx.request_info(
            request_data=GuessRequest(prompt=prompt or "Pick a number 1-10:"),
            response_type=int,
        )

    @response_handler
    async def check(
        self,
        request: GuessRequest,
        guess: int,
        ctx: WorkflowContext[None, str],
    ) -> None:
        if guess == self.secret:
            await ctx.yield_output(f"correct! the number was {self.secret}")
        elif guess < self.secret:
            await ctx.yield_output(f"too low - secret was {self.secret}")
        else:
            await ctx.yield_output(f"too high - secret was {self.secret}")


workflow = WorkflowBuilder(start_executor=GuessingGame(secret=7)).build()

Four details worth pinning:

  • request_info is the pause. The call does not return a value; it suspends the executor. The next time this executor is scheduled, it will be via the @response_handler method, not the original @handler. Treat them as two halves of one turn.
  • response_type=int is how the framework validates the incoming response. If the caller passes responses={request_id: "five"} instead of 5, MAF rejects the resume.
  • @response_handler signature matters. The four parameters — self, request, response, ctx — are not optional. Drop request and you lose access to the original payload; drop the WorkflowContext[T, U] generics and MAF’s validator refuses to register the handler.
  • ctx.yield_output(...) ends the workflow for this run. The caller’s second stream will emit an output event carrying the string, then complete.

Driving the workflow — the caller side
#

The interesting code is in the driver, not the executor:

async def run_with_response(secret: int, guess: int) -> str:
    workflow = build_workflow(secret)

    # Call 1: workflow pauses on request_info. Consume the full stream so
    # the workflow's internal run state is cleanly idle before we resume.
    pending_request_id: str | None = None
    async for event in workflow.run("Pick a number 1-10:", stream=True):
        if pending_request_id is None and getattr(event, "type", None) == "request_info":
            pending_request_id = getattr(event, "request_id", None)

    assert pending_request_id, "expected a request_info event to pause the workflow"

    # Call 2: resume with the canned response. Runs to completion.
    outputs: list[str] = []
    async for event in workflow.run(responses={pending_request_id: guess}, stream=True):
        if getattr(event, "type", None) == "output":
            data = getattr(event, "data", None)
            if isinstance(data, str):
                outputs.append(data)
    return outputs[-1] if outputs else ""

The two rules that trip people up:

  • Drain the first stream fully. Breaking out of the async for the instant you see request_info leaves the workflow in a half-dispatched state. MAF rejects a concurrent workflow.run(...) call while the previous stream has unconsumed events. The cheapest fix is to let the iterator exhaust itself after you’ve captured the id.
  • Pass the same request_id string verbatim. The id is opaque — no parsing, no reformatting. Store it next to whatever UI state is holding the human’s decision.

Running it:

cd e-commerce-agents
source agents/python/.venv/bin/activate
python tutorials/17-human-in-the-loop/python/main.py
# Pick a number 1-10: 5
# too low - secret was 7

Source: dotnet/Program.cs. The .NET API has a different shape: instead of an inline ctx.request_info call, you create a RequestPort as a first-class node in the workflow graph, wire it into the builder, and use a single StreamingRun that you feed responses into as they come in.

using Microsoft.Agents.AI.Workflows;

internal enum NumberSignal { Init, Above, Below }

internal sealed class JudgeExecutor() : Executor<int>("judge")
{
    private readonly int _target;
    private int _tries;

    public JudgeExecutor(int target) : this() => this._target = target;

    public override async ValueTask HandleAsync(
        int guess,
        IWorkflowContext context,
        CancellationToken cancellationToken = default)
    {
        this._tries++;

        if (guess == this._target)
        {
            await context.YieldOutputAsync(
                $"correct! the number was {this._target} (after {this._tries} tries)",
                cancellationToken);
            return;
        }

        NumberSignal hint = guess < this._target ? NumberSignal.Below : NumberSignal.Above;
        await context.SendMessageAsync(hint, cancellationToken: cancellationToken);
    }
}

// Build the workflow. The request port is both the starting executor (it emits
// the first RequestInfoEvent when we kick the run off with NumberSignal.Init)
// and the downstream target of the judge, so the loop keeps pausing until the
// judge yields output.
RequestPort numberPort = RequestPort.Create<NumberSignal, int>("GuessNumber");
JudgeExecutor judge = new(target: 7);

Workflow workflow = new WorkflowBuilder(numberPort)
    .AddEdge(numberPort, judge)
    .AddEdge(judge, numberPort)
    .WithOutputFrom(judge)
    .Build();

Three differences from Python worth flagging:

  • RequestPort is a node. You see it in the graph. The edge judge -> numberPort is literally “when the judge emits a NumberSignal, hand it to the port, which pauses the workflow and emits RequestInfoEvent.” Python hides this by making ctx.request_info look like a local await.
  • Response routing is by edge, not by decorator. The port re-dispatches the int response along the edge numberPort -> judge, which invokes judge.HandleAsync(int, ...). There is no [ResponseHandler] in this sample because the judge’s single typed handler already accepts the response type. For cases where the request type and response type live on separate methods, .NET 1.1 generates a [ResponseHandler] partial method via a source generator.
  • WithOutputFrom(judge) pins the output-producing node. Without it, the workflow has no natural completion point; the judge’s YieldOutputAsync would succeed but the caller’s stream would keep waiting.

The caller loop — one stream, inline responses
#

.NET collapses the two Python calls into a single StreamingRun that you feed responses into inline:

await using StreamingRun run = await InProcessExecution
    .RunStreamingAsync(workflow, NumberSignal.Init);

await foreach (WorkflowEvent evt in run.WatchStreamAsync())
{
    switch (evt)
    {
        case RequestInfoEvent request:
            int guess = ReadGuessFromConsole(request);
            await run.SendResponseAsync(request.Request.CreateResponse(guess));
            break;

        case WorkflowOutputEvent output:
            Console.WriteLine(output.Data);
            return 0;
    }
}

Three practical notes:

  • Single stream, inline responses. The Python caller makes two run(...) calls because the Python stream terminates on request_info. The .NET WatchStreamAsync() keeps yielding events across pauses — you just call SendResponseAsync(...) in the middle and the loop continues. Same state machine, different iterator semantics.
  • request.Request.CreateResponse(guess) is the canonical way to build a response envelope. It carries the request_id under the hood so SendResponseAsync knows which pending request to resolve.
  • Multiple rounds are free. The sample’s judge sends NumberSignal.Above / Below back through the port on every miss, which re-emits RequestInfoEvent for the next guess. The caller loop pauses, reads another guess, sends it in, and keeps going. No extra bookkeeping; the framework threads the conversation.

Inside RequestInfoEvent
#

The port speaks a small, well-typed protocol. Each pause surfaces as a RequestInfoEvent with two load-bearing properties:

  • request.Request.Data — the payload you sent in via ctx.SendMessageAsync(NumberSignal.Above). You read it back with request.Request.TryGetDataAs<NumberSignal>(out var signal), which is how the sample prints “previous guess was too high” before prompting for the next round.
  • request.Request.CreateResponse(value) — constructs a response envelope carrying the (framework-managed) request_id and the typed response. SendResponseAsync uses that envelope to route the response along the RequestPort -> judge edge.

The caller never sees the request_id string directly. It’s an implementation detail of the envelope, which is fine because .NET’s single-stream model means you never have to persist the correlation id across process boundaries — the StreamingRun holds it. If you need durability (human takes a long lunch, process restarts), checkpoint the workflow. Ch18 picks that thread up.

Running it:

cd tutorials/17-human-in-the-loop/dotnet
dotnet run -- 7             # scripted: always guesses 7
# Chapter 17 - Human-in-the-Loop (guess the number 1..10)
#   -> sending scripted guess 7
# correct! the number was 7 (after 1 tries)

dotnet run                  # interactive: prompts each round until you win

Side-by-side — Python vs .NET
#

| Aspect | Python | .NET |
| --- | --- | --- |
| Pause primitive | ctx.request_info(request_data, response_type) from any @handler | RequestPort.Create<TReq, TResp>(id) as a graph node |
| Resume hook | @response_handler method with (self, request, response, ctx) | Edge from RequestPort to the executor; source-gen [ResponseHandler] when the types differ |
| Correlation id | event.request_id on the request_info event | request.Request.CreateResponse(value) carries it |
| Caller pattern | Two workflow.run(...) calls: first without responses, second with responses={id: value} | Single StreamingRun.WatchStreamAsync() loop with SendResponseAsync(...) inline |
| Multiple rounds | Make more workflow.run(responses=...) calls until you see output | Keep iterating the same stream; the port re-emits RequestInfoEvent on each pause |
| Event type (pause) | WorkflowEvent with type == "request_info" | RequestInfoEvent class |
| Event type (done) | WorkflowEvent with type == "output" | WorkflowOutputEvent class |
| Context generics | WorkflowContext[T, U] required on handler methods | IWorkflowContext (non-generic) |

Structurally identical; the developer experience diverges on the event iterator. Python makes the stream terminate at each pause so the caller has to re-enter run(...). .NET keeps the stream open and lets the caller respond in place. Neither is wrong; they’re different defaults for the same pause/resume state machine.
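The multiple-rounds case on the Python side deserves a sketch. Assuming only the two-call contract (`FakeWorkflow` below is a test double, not MAF), the driver is a loop that keeps resuming until an output event appears:

```python
import uuid

class FakeWorkflow:
    """Test double for the Python two-call contract: pauses on request_info
    until the guess matches, then emits output. Not the MAF API."""

    def __init__(self, secret: int) -> None:
        self.secret = secret
        self.pending: set[str] = set()

    def run(self, prompt=None, responses=None) -> list[dict]:
        if responses is None:
            rid = uuid.uuid4().hex
            self.pending.add(rid)
            return [{"type": "request_info", "request_id": rid}]
        events = []
        for rid, guess in responses.items():
            self.pending.discard(rid)
            if guess == self.secret:
                events.append({"type": "output", "data": f"correct: {guess}"})
            else:  # wrong guess: pause again with a fresh id
                nid = uuid.uuid4().hex
                self.pending.add(nid)
                events.append({"type": "request_info", "request_id": nid})
        return events

def drive(workflow, answers) -> str:
    """Keep calling run(responses=...) until an output event appears."""
    answers = iter(answers)
    events = workflow.run("start")
    while True:
        outputs = [e for e in events if e["type"] == "output"]
        if outputs:
            return outputs[-1]["data"]
        responses = {e["request_id"]: next(answers)
                     for e in events if e["type"] == "request_info"}
        events = workflow.run(responses=responses)

print(drive(FakeWorkflow(secret=7), answers=[3, 9, 7]))  # correct: 7
```

Each round's `request_id` is fresh — the driver never reuses an id, which is exactly the single-use property the gotchas section insists on.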

Tool Approval UX — the forward-link from Ch06
#

Chapter 6 — Middleware showed the mechanism for gating destructive tool calls: a FunctionMiddleware that inspects context.function.name, short-circuits by setting context.result to a placeholder string, and records a pending-approval row. That chapter deliberately stopped at the gate itself and deferred the user-facing half to Ch17.

The HITL primitive above is that user-facing half. Same shape, different payload:

# Ch06: FunctionMiddleware sees a destructive tool call, records it as pending,
# returns a placeholder to the LLM.
# Ch17: the agent workflow emits a request_info event carrying a
# ToolApprovalRequestContent (C#) or a "function_approval_request" content
# (Python), and waits for the caller to come back with approve/reject.

# Agent-level API (from the MS docs):
result = await agent.run("Cancel order 9817 and refund the customer.")
for req in result.user_input_requests:
    # req.function_call.name == "cancel_order"
    # req.function_call.arguments == {"order_id": "9817"}
    show_ui_card(req)

# After the UI collects a yes/no:
approval = req.create_response(user_said_yes)            # or req.to_function_approval_response(True)
approval_msg = Message(role="user", contents=[approval])
final = await agent.run(approval_msg, session=session)

.NET’s equivalent wraps the AIFunction in ApprovalRequiredAIFunction at construction time, then checks response.Messages.SelectMany(m => m.Contents).OfType<FunctionApprovalRequestContent>() after each RunAsync(...):

AIFunction weather = AIFunctionFactory.Create(GetWeather);
AIFunction gated   = new ApprovalRequiredAIFunction(weather);

AIAgent agent = client.AsAIAgent(
    model: "gpt-4.1",
    instructions: "You are a helpful assistant.",
    tools: [gated]);

AgentSession session = await agent.CreateSessionAsync();
AgentResponse response = await agent.RunAsync("What's the weather in Amsterdam?", session);

var requests = response.Messages
    .SelectMany(m => m.Contents)
    .OfType<FunctionApprovalRequestContent>()
    .ToList();

// Show request.FunctionCall.Name / Arguments in the UI, collect yes/no.
var approved = new ChatMessage(ChatRole.User, [requests[0].CreateResponse(approve: true)]);
AgentResponse final = await agent.RunAsync(approved, session);

Two things this adds over Ch06:

  • The correlation is the FunctionApprovalRequestContent / user_input_requests entry. Same role as request_id in the raw HITL flow above — the thing you pair with the yes/no to resume.
  • No separate workflow.run() call. Agent-level tool approval lives on Agent.RunAsync, not Workflow.run. The loop-until-approved pattern is the same shape as the HITL caller loop; the API surface is one level up.

For the gate itself (interception, placeholder result, audit row), stay on Ch06. For the UI, resumption, and the end-to-end shape, this chapter is the reference. In production you almost always want both: Ch06’s middleware is the enforcement point that survives a careless prompt change; this chapter’s caller loop is what the frontend actually renders.
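The loop-until-approved shape is easy to pin down without the real SDK. Everything named `Fake*` below is a stand-in for testing; only the control flow — run, collect `user_input_requests`, answer each one, re-run on the same session — mirrors the agent-level API quoted above:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class FakeApprovalRequest:
    """Stand-in for a FunctionApprovalRequestContent-style entry."""
    tool: str

    def create_response(self, approved: bool) -> dict:
        return {"tool": self.tool, "approved": approved}

@dataclass
class FakeResult:
    text: str
    user_input_requests: list = field(default_factory=list)

class FakeAgent:
    """First run asks for approval; the follow-up run completes."""

    async def run(self, message, session=None):
        if isinstance(message, str):
            return FakeResult("pending", [FakeApprovalRequest("cancel_order")])
        approved = all(c["approved"] for c in message)
        return FakeResult("cancelled" if approved else "aborted")

async def run_with_approvals(agent, prompt, ask_human):
    """Loop until the agent stops asking for human input."""
    result = await agent.run(prompt)
    while result.user_input_requests:
        responses = [req.create_response(ask_human(req))
                     for req in result.user_input_requests]
        result = await agent.run(responses)
    return result

result = asyncio.run(run_with_approvals(
    FakeAgent(), "Cancel order 9817", ask_human=lambda req: True))
print(result.text)  # cancelled
```

The `while` condition is the whole pattern: an empty `user_input_requests` list is the termination signal, the same way an output event terminates the raw HITL loop.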

The agent-level API also composes with the raw request_info primitive. If an agent is running inside a workflow (Ch11), tool approvals surface as the same RequestInfoEvent / request_info events the guessing game emits. One caller loop handles both. That’s the property that makes this one chapter cover “approve a refund” and “pick a number” with the same code path.

End-to-end: a cancel-order request through both layers
#

The Ch06 middleware and the Ch17 caller loop are easier to internalize when you trace one concrete request all the way through. Here’s “user types Cancel order 9817 and refund the customer” as it actually flows:

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor': '#2563eb','primaryTextColor': '#ffffff','primaryBorderColor': '#1e40af', 'lineColor': '#64748b','secondaryColor': '#f59e0b','tertiaryColor': '#10b981', 'background': 'transparent'}}}%%
sequenceDiagram
    autonumber
    participant U as User
    participant W as Web UI
    participant O as Orchestrator
    participant L as LLM
    participant M as ApprovalGateMiddleware (Ch06)
    participant T as cancel_order tool
    U->>W: "Cancel order 9817"
    W->>O: POST /chat (turn 1)
    O->>L: agent.run("Cancel order 9817")
    L-->>O: tool call → cancel_order(order_id="9817")
    O->>M: FunctionInvocationContext
    M->>M: name in DESTRUCTIVE? yes
    M->>M: approvals[corr_id]? not yet
    M-->>O: context.result = "pending_approval:abc12345 …"
    O-->>L: tool result (placeholder)
    L-->>O: assistant text "I've queued the cancellation; awaiting approval"
    O-->>W: response.user_input_requests = [FunctionApprovalRequestContent(corr_id=abc12345)]
    W-->>U: render approval card "Cancel order 9817 — Approve / Reject"
    U->>W: clicks Approve
    W->>O: POST /chat (turn 2) body = req.create_response(True)
    O->>L: agent.run(approval_msg, session=same_session)
    L-->>O: tool call → cancel_order(order_id="9817")
    O->>M: FunctionInvocationContext
    M->>M: approvals[corr_id]? yes
    M->>T: await call_next()
    T-->>M: {"success": true, "refund_amount": 129.99}
    M-->>O: context.result = real result
    O-->>L: tool result (real)
    L-->>O: assistant text "Order cancelled, refund of $129.99 issued"
    O-->>W: streamed final response
    W-->>U: success card

Two HTTP turns to the orchestrator. The first one short-circuits at the middleware and surfaces an approval request. The second one carries the user’s True back as a FunctionApprovalResponseContent, the middleware’s approvals dict picks it up via the correlation id, and the real cancel_order runs.

What lives where:

  • Step 7–10 (the gate) — pure Ch06 territory. The FunctionMiddleware sees the destructive tool, looks up the approval, doesn’t find one, sets context.result to the placeholder string, and returns. The tool body never executes.
  • Step 11–13 (the surface) — this chapter’s territory. The agent’s response carries a FunctionApprovalRequestContent (or Python’s function_approval_request) in response.user_input_requests, which the UI renders as an interactive card.
  • Step 14–16 (the resume) — also this chapter. The UI builds an approval message via req.create_response(True), ships it to the orchestrator on the same session, and the agent re-runs the same tool call.
  • Step 17–22 (the green path) — back to Ch06. The middleware inspects approvals[corr_id], sees True, calls call_next(), and the real tool runs.

The same correlation id (abc12345 above) ties the two halves together. In the capstone it’s a uuid4()[:8] keyed on the destructive tool name and arguments — short enough to log and scan, unique enough to never collide. The approvals dict is keyed by it on both ends; in production it lives in Redis with a 10-minute TTL so abandoned approvals don’t pile up.

If you read Ch06 and Ch17 separately, the temptation is to think of them as alternatives (“middleware OR HITL”). The diagram above is the antidote: they’re the two halves of one mechanism. Ch06 is the enforcement (the tool can’t run without approval); Ch17 is the experience (the user can actually grant approval). Production needs both.

Gotchas
#

  • Don’t use from __future__ import annotations in Python modules that define response handlers. It turns type annotations into strings, which MAF’s validator can’t resolve — @response_handler falls back to “untyped” and the handler never fires. We had to strip it from Ch17’s main.py.
  • WorkflowContext[T, U] generics are required. A bare WorkflowContext on the ctx parameter is silently treated as “no generics,” and MAF’s type-routed dispatcher mis-matches the handler. The docs hint that generics are optional; in 1.1 they are not.
  • Every request type must have a matching @response_handler on the same executor. If you have two ctx.request_info(...) calls with different payload types, each needs its own handler. MAF warns on import if a request type has no handler; the response is then dropped at resume time.
  • Drain the first stream before the second call. Python rejects a second workflow.run(...) while the first iterator still has events buffered. Let the async for finish even after you’ve captured the request_id.
  • request_id is opaque and single-use. Don’t parse it, don’t log-and-replay it, don’t pass it to a sibling workflow. Store it next to the UI state that’s collecting the human answer, retrieve it on resume, throw it away after.
  • .NET RequestPort name is the id prefix. The string "GuessNumber" in RequestPort.Create<..., ...>("GuessNumber") becomes the port’s executor id and surfaces on telemetry. Make it meaningful and stable.
  • Checkpoints save pending requests. RequestPort pauses survive checkpointing — on restore, the framework re-emits the RequestInfoEvent so the caller can handle it. You can’t supply responses inline during RestoreAsync; you must observe the re-emitted event and call SendResponseAsync(...) as usual. Ch18 covers checkpoints; this chapter assumes in-memory pauses.
  • Agent-level tool approval is the same primitive. A FunctionApprovalRequestContent flows through the same HITL machinery as a raw request_info. Treat them the same architecturally — the API surface differs but the “collect, correlate, resume” dance does not.

Tests
#

Python ships 5 deterministic tests — no LLM, no network, no flakiness:

# From repo root, against the project venv:
cd agents/python
uv run python -m pytest ../../tutorials/17-human-in-the-loop/python/tests/ -v
# 5 passed in 0.51s

The tests cover:

  • Wiring — build_workflow(secret=42) returns a non-null workflow.
  • Correct guess — run_with_response(secret=7, guess=7) reports “correct” and contains 7.
  • Low guess — run_with_response(secret=7, guess=3) reports “too low”.
  • High guess — run_with_response(secret=7, guess=10) reports “too high”.
  • Pause semantics — the first workflow.run("Pick a number:", stream=True) call emits request_info and does not emit output. This is the property the whole chapter rests on; if it ever regresses the HITL primitive is broken.

.NET builds cleanly under TreatWarningsAsErrors=true and runs end-to-end:

cd tutorials/17-human-in-the-loop/dotnet
dotnet build                  # 0 warnings, 0 errors
dotnet run -- 7               # scripted
# Chapter 17 - Human-in-the-Loop (guess the number 1..10)
#   -> sending scripted guess 7
# correct! the number was 7 (after 1 tries)

dotnet run                    # interactive: prompts each round until you win

The .csproj pins Microsoft.Agents.AI.Workflows 1.1.0 — same version as Ch12/Ch13/Ch14.

How this shows up in the capstone
#

Phase 7 plans/refactor/09-return-replace-sequential-hitl.md adds a HITL approval gate to the Return/Replace workflow. When the order total exceeds RETURN_HITL_THRESHOLD (env-configurable, default $500), the workflow pauses via ctx.request_info and the orchestrator surfaces the pending request as a ConfirmReturnCard in the frontend SSE stream. The request_id round-trips through the UI state — the card stores it, the “Approve” / “Reject” button click posts {request_id, approved: true} to the orchestrator, and the orchestrator calls workflow.run(responses={request_id: approved}) on the suspended workflow. Approval resumes the discount-apply executor; rejection short-circuits to a cancellation message.
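The two pieces the orchestrator owns — the threshold check and the resume payload — are small enough to sketch. `RETURN_HITL_THRESHOLD` and the `{request_id, approved}` POST body shape come from the description above; the function names are illustrative:

```python
import os

# Env-configurable gate, defaulting to $500 as described above.
RETURN_HITL_THRESHOLD = float(os.environ.get("RETURN_HITL_THRESHOLD", "500"))

def needs_human_approval(order_total: float) -> bool:
    """The executor only pauses via request_info above this total."""
    return order_total > RETURN_HITL_THRESHOLD

def build_resume_payload(body: dict) -> dict:
    """Turn the UI's POST body into the responses= mapping for the
    suspended workflow's second run call."""
    return {body["request_id"]: bool(body["approved"])}

print(needs_human_approval(750.0))  # True
print(build_resume_payload({"request_id": "abc123", "approved": True}))
```

Everything else — the SSE card, the suspended workflow object, the approve/reject branch — sits on either side of these two functions.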

The Magentic plan-review feature (Ch16) uses the same primitive under the hood for optional plan confirmation before the manager starts delegating to workers.

The agent-level tool approval pattern (Ch06’s FunctionMiddleware + this chapter’s UX) shows up in the capstone as the gate on cancel_order, initiate_refund, and delete_account in the Order Management specialist. The gate short-circuits in middleware; the frontend renders the pending approval and posts the yes/no back through the same orchestrator endpoint.
