Build approval gates, durable checkpoints, and guarded resumes for agent actions that change real-world state.
Computer-use agents can click buttons, fill forms, and operate real software. That makes the next production question unavoidable: what happens when the agent is about to spend money, change customer data, send an external message, or deploy code?
Human-in-the-Loop (HITL) architecture adds a hard execution boundary for those moments. An agent can propose a side effect, but a downstream policy service pauses it until the required reviewer authorizes the exact action.
This addresses the opaque execution problem: even capable models can produce confident but incorrect outputs, and in e-commerce operations, one bad refund or misdirected shipment can cost real money and customer trust. HITL doesn't make an action correct by itself. It creates a point where policy, evidence, authorization, and final-state checks can stop a bad action.
HITL is also a practical control for one of the top agent risks. The OWASP Top 10 for LLM Applications lists LLM06: Excessive Agency, which it breaks into excessive functionality, excessive permissions, and excessive autonomy. Requiring human approval for high-impact actions is one recommended mitigation for the excessive-autonomy facet, ideally enforced in a downstream system rather than left to the model to decide. [1] Read this article as the design pattern that implements that control.
Like a warehouse system that can retrieve inventory records but must route a money movement to an authorized reviewer, a HITL agent runs only within a defined risk boundary. Unlike a chatbot that waits for input at every turn, this pattern supervises actions according to their possible effect.
To make this concrete, we'll follow Riley, an order-support agent for an online electronics store. Riley can look up authorized order data and draft suggested responses. Riley can also propose a refund, an address change, or a customer email, but a downstream gate controls whether any of those side effects run. The dollar thresholds in this lesson are illustrative policy choices, not universal rules.
By the end, you'll be able to design Riley's risk policy, implement checkpoint/resume logic that lets Riley wait durably for review, and build an approval UI that gives reviewers enough evidence without exposing unnecessary customer data. We'll assume you already know LLM tool calling, state graphs, guardrail policy, and browser-agent risk. If those terms are fuzzy, review Function Calling & Tool Use, Agentic Architectures, Guardrails & Safety Filters, and Computer Use & Browser Agents first.
Not all human oversight is created equal. Production systems distinguish three interaction models, chosen by risk tolerance and the nature of the task. Understanding the differences helps you design appropriate governance frameworks.
In the Human-in-the-Loop (HITL) model, the agent can't proceed with a gated action without explicit human approval. Use it for actions whose impact isn't acceptable to discover after execution, such as money movement, production deployment, or a customer-record change. The agent proposes an action, pauses execution, and waits for an approve, reject, or modify decision before continuing. A modified proposal still needs validation.
In the Human-on-the-Loop (HOTL) model, the agent performs tasks autonomously while a human monitors the process and retains the ability to interrupt or override. This fits actions whose effect can be stopped or repaired after detection, such as routing internal draft suggestions into a queue. It isn't a sufficient gate for sending an incorrect customer message or issuing a refund: the human may see the mistake only after the effect has happened. HITL blocks before execution, while HOTL can only interrupt execution that is already allowed.
At the far end of the spectrum, some actions are automated with no real-time human oversight. This applies only to bounded actions with clear authorization and acceptable failure modes. An authorized order-status read or a draft created in an internal queue may fit; an unrestricted database read or web action doesn't become low risk simply because it doesn't write data. Some actions stay HITL because their effect, policy, or applicable duties demand a pre-execution gate.
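The three models can be read as a decision over a few properties of the action. The sketch below is a deliberate simplification of the prose above, not a complete policy: the `Oversight` enum, the `choose_oversight` function, and its four boolean traits are hypothetical names introduced here for illustration.

```python
from enum import Enum

class Oversight(Enum):
    HITL = "human-in-the-loop"   # block before execution
    HOTL = "human-on-the-loop"   # run, but monitor and allow interruption
    AUTONOMOUS = "autonomous"    # run with no real-time oversight

def choose_oversight(*, bounded: bool, authorized: bool,
                     has_side_effect: bool, recoverable: bool) -> Oversight:
    """Pick the least restrictive model the action's effect allows (illustrative)."""
    # An unbounded or unauthorized action never runs unsupervised, even a read.
    if not (bounded and authorized):
        return Oversight.HITL
    if not has_side_effect:
        return Oversight.AUTONOMOUS
    # A side effect that can be stopped or repaired may run under monitoring.
    return Oversight.HOTL if recoverable else Oversight.HITL

examples = {
    "order_status_read": dict(bounded=True, authorized=True, has_side_effect=False, recoverable=True),
    "queue_internal_draft": dict(bounded=True, authorized=True, has_side_effect=True, recoverable=True),
    "process_refund": dict(bounded=True, authorized=True, has_side_effect=True, recoverable=False),
    "unrestricted_db_read": dict(bounded=False, authorized=True, has_side_effect=False, recoverable=True),
}
for name, traits in examples.items():
    print(f"{name}: {choose_oversight(**traits).value}")
```

A real policy would also account for legal duties and organizational rules, which can keep an action in HITL even when its effect looks recoverable.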
Not all agent actions carry the same risk. Designing a HITL system begins by classifying every tool and action along a trust spectrum. That classification determines how much friction an action meets before execution.
A naive implementation treats all actions equally: either everything is autonomous (dangerous) or everything requires approval (tedious). A production-grade system defines granular policies:
| Risk Level | Description | Examples | Interaction Model |
|---|---|---|---|
| Low | Bounded, authorized read or isolated draft | Look up permitted order fields, Check inventory, Draft reply without sending | Auto-Execute: Run within least-privilege access. |
| Medium | Internal staged change with a recoverable effect | Queue review packet, Add a permitted internal classification label | Policy Decision: Audit and notify, or require review if policy says so. |
| High | External or monetary side effect | Send customer email, Create return shipment, Process any refund | Approve: Pause and wait for an authorized reviewer. |
| Critical | Destructive, unusually high-impact, or legally constrained | Delete record subject to retention rules, Process unusually large refund, Change address after shipment | Escalate or Block: Require stronger authorization or deny. |
We can implement this policy with a RiskLevel enum and a lookup table that maps each available tool to a risk tier. This decouples the agent's logic ("I want to do X") from the governance logic ("Should I allow X?"). The table is the core of the policy engine: given a tool's name, and later its runtime arguments, it determines the authorization flow required before execution.

```python
from enum import Enum
from dataclasses import dataclass

class RiskLevel(Enum):
    AUTO = "auto"          # Execute immediately, no human needed
    NOTIFY = "notify"      # Execute and notify human asynchronously
    APPROVE = "approve"    # Pause and wait for human approval
    ESCALATE = "escalate"  # Route to senior human + require 2 approvals

@dataclass
class ToolPolicy:
    tool_name: str
    risk_level: RiskLevel
    escalate_above_amount: float | None = None
    requires_reason: bool = False
    timeout_minutes: int = 60  # Auto-reject after timeout

TOOL_POLICIES = {
    # Safe actions run automatically
    "look_up_order": ToolPolicy("look_up_order", RiskLevel.AUTO),
    "check_inventory": ToolPolicy("check_inventory", RiskLevel.AUTO),

    # Customer communication requires justification
    "send_email": ToolPolicy("send_email", RiskLevel.APPROVE, requires_reason=True),

    # Refunds create a monetary side effect, so they always begin at APPROVE.
    "process_refund": ToolPolicy(
        "process_refund",
        RiskLevel.APPROVE,
        escalate_above_amount=500.0,
        timeout_minutes=30,
    ),

    # Destructive or high-value actions require escalation
    "delete_customer": ToolPolicy("delete_customer", RiskLevel.ESCALATE),
    "change_shipped_address": ToolPolicy("change_shipped_address", RiskLevel.ESCALATE),
}

RISK_PRIORITY = {
    RiskLevel.AUTO: 0,
    RiskLevel.NOTIFY: 1,
    RiskLevel.APPROVE: 2,
    RiskLevel.ESCALATE: 3,
}

def escalate_risk(current: RiskLevel, target: RiskLevel) -> RiskLevel:
    return target if RISK_PRIORITY[target] > RISK_PRIORITY[current] else current

print("look_up_order:", TOOL_POLICIES["look_up_order"].risk_level.value)
print("refund escalation threshold:", TOOL_POLICIES["process_refund"].escalate_above_amount)
print("critical wins:", escalate_risk(RiskLevel.ESCALATE, RiskLevel.APPROVE).value)
```

```text
look_up_order: auto
refund escalation threshold: 500.0
critical wins: escalate
```

Risk is not static. A "send email" tool might be Low Risk when emailing an internal test address, but High Risk when emailing an external domain. Production policies are dynamic, checking arguments at runtime.
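As a rough illustration of argument-dependent risk, the sketch below classifies a `send_email` call by its recipient domains. The `Risk` enum, the `INTERNAL_DOMAINS` allow-list, the `email_risk` function, and the `to` argument name are all assumptions made for this example; a deployment that treats the static table as a floor would only ever raise the tier from that starting point.

```python
from enum import IntEnum

class Risk(IntEnum):
    AUTO = 0
    NOTIFY = 1
    APPROVE = 2
    ESCALATE = 3

# Hypothetical allow-list; a real system would load this from policy config.
INTERNAL_DOMAINS = {"corp.example.com"}

def email_risk(args: dict[str, object]) -> Risk:
    """Classify a send_email call from its runtime arguments (illustrative)."""
    recipients = args.get("to")
    if not isinstance(recipients, list) or not recipients:
        return Risk.ESCALATE  # Malformed or missing recipients: fail closed.
    # Take the text after the last "@" of each recipient as its domain.
    domains = {str(r).rsplit("@", 1)[-1].lower() for r in recipients}
    if domains <= INTERNAL_DOMAINS:
        return Risk.AUTO      # Every recipient is on the internal allow-list.
    return Risk.APPROVE       # At least one external recipient.

print(email_risk({"to": ["qa@corp.example.com"]}).name)
print(email_risk({"to": ["qa@corp.example.com", "casey@example.com"]}).name)
print(email_risk({}).name)
```

The key design choice is that one external recipient is enough to raise the tier for the whole call, and that unparseable arguments are never treated as low risk.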
The fundamental architectural challenge of HITL is the pause. A review can take minutes or days, and the process may restart or be redeployed in the meantime, so a paused agent can't sleep the thread (time.sleep()) or hold its state only in memory.
The solution is the Checkpoint/Resume pattern. When an approval is needed, the agent persists its graph state (messages, structured variables, and pending task metadata) to durable storage and returns control to the caller. When the human approves, the runtime reloads that state and re-enters the waiting node.
To see what gets saved, here is what Riley's state might look like when paused:

```json
{
  "thread_id": "order_78291",
  "request_summary": "Customer reports laptop not received after a delivery scan.",
  "evidence": ["order total: $899", "carrier status: delivered"],
  "pending_action": {
    "tool": "process_refund",
    "args": {"order_id": "78291", "amount": 899.00, "reason": "not_received"}
  },
  "approval": {
    "status": "pending",
    "policy_rule": "refund.requires_review",
    "action_hash": "sha256:5e73...",
    "version": 3,
    "expires_at": "2026-08-22T19:00:00Z"
  }
}
```

When an authorized reviewer clicks Approve, the runtime resolves this pending decision with a guarded write, reloads the checkpoint, rechecks current state and arguments, and only then attempts the action. If the reviewer clicks Reject, Riley can draft a response without running the refund. Checkpoint state is ordinary inspectable data, but emergency changes should still go through an authenticated, versioned, audited path rather than an ad hoc database edit.
Before building a workflow engine, you can test the persistence boundary with ordinary data. The checkpoint below stores a redacted summary and an action hash, but not the customer's raw message.

```python
from hashlib import sha256
import json

# Placeholder address on the reserved example.com domain.
raw_message = "I'm Casey (casey@example.com). My laptop didn't arrive."
pending_action = {
    "tool": "process_refund",
    "args": {"order_id": "78291", "amount": 899.00},
}
# sort_keys=True keeps the serialized form, and therefore the hash, stable.
action_bytes = json.dumps(pending_action, sort_keys=True).encode()
checkpoint = {
    "request_summary": "Customer reports delivery problem for order 78291.",
    "pending_action": pending_action,
    "action_hash": sha256(action_bytes).hexdigest()[:12],
    "status": "pending",
}
stored = json.dumps(checkpoint)

print("status:", checkpoint["status"])
print("action hash present:", bool(checkpoint["action_hash"]))
print("raw message stored:", raw_message in stored)
```

```text
status: pending
action hash present: True
raw message stored: False
```

Here is a comparison of two popular approaches for implementing this pattern:
| Feature | LangGraph (interrupt) | Temporal |
|---|---|---|
| Best for | Graph-based agents with checkpointed interrupts | Workflow orchestration with timers, retries, and cross-service activities |
| State Persistence | Durable checkpointer (Postgres, Redis, etc.) | Built-in durable event history |
| Resume Mechanism | API calls invoking Command(resume=...) | Signals or Updates |
| Operational shape | Agent runtime plus durable checkpointer | Workflow service plus activity workers and message handlers |
LangGraph provides first-class support for this via interrupt(). [2] When you compile the graph with a checkpointer, LangGraph persists checkpoints for the thread and pauses execution until the graph is invoked again with Command(resume=...). [3] One subtle detail matters in production: when execution resumes, LangGraph reruns the node from the top, so any code before interrupt() must be idempotent. [2]

```python
from langgraph.graph import StateGraph
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.types import interrupt, Command
from langchain_core.messages import AIMessage
from typing import TypedDict

# TOOL_POLICIES and RiskLevel come from the policy table above. plan_node,
# execute_tool, action_idempotency_key, and the validate_* helpers are
# assumed to be defined elsewhere.

Message = dict[str, object]
ToolAction = dict[str, object]

class AgentState(TypedDict):
    messages: list[Message]
    pending_action: ToolAction | None
    approval_status: str | None

def execute_action_node(state: AgentState) -> dict:
    """Execute a tool call, pausing for approval if needed."""
    action = state["pending_action"]
    if action is None:
        return {"messages": [], "pending_action": None, "approval_status": None}

    # Fail closed: a tool without a policy never runs.
    policy = TOOL_POLICIES.get(action["tool"])
    if policy is None:
        return {
            "messages": [AIMessage(content="Action blocked: tool has no policy.")],
            "pending_action": None,
            "approval_status": "rejected",
        }

    # Check if we need to pause
    if policy.risk_level in (RiskLevel.APPROVE, RiskLevel.ESCALATE):
        # PAUSE execution here.
        # The graph state is automatically saved to Postgres by the checkpointer.
        # The runtime handles the pause; don't catch interrupt() in try/except.
        human_response = interrupt({
            "action": action,
            "risk_level": policy.risk_level.value,
            "reason": f"Agent wants to {action['tool']} with args: {action['args']}",
            "requires_reason": policy.requires_reason,
        })

        # This code runs ONLY after the human resumes execution
        if human_response["decision"] == "reject":
            return {
                "messages": [AIMessage(content="Action was rejected by human reviewer.")],
                "pending_action": None,
                "approval_status": "rejected",
            }

        # An approved edit is a new proposal. Validate it before execution.
        if human_response["decision"] == "modify":
            execution_args = validate_modified_args(
                action["tool"],
                human_response["modified_args"],
                policy=policy,
            )
        else:
            execution_args = validate_current_args(
                action["tool"],
                action["args"],
                policy=policy,
            )
    else:
        execution_args = validate_autonomous_args(
            action["tool"],
            action["args"],
            policy=policy,
        )

    # Bind execution to an idempotency key so retries can't repeat a side effect.
    result = execute_tool(
        action["tool"],
        execution_args,
        idempotency_key=action_idempotency_key(action),
    )

    return {
        "messages": [AIMessage(content=f"Action completed: {result}")],
        "pending_action": None,
        "approval_status": "executed",
    }

# Setup the graph with persistence
graph = StateGraph(AgentState)
graph.add_node("plan", plan_node)
graph.add_node("execute", execute_action_node)
# ... define edges ...

# The checkpointer provides durable HITL state.
# The first time you use a Postgres checkpointer, call setup() to create tables.
with PostgresSaver.from_conn_string("postgresql://...") as checkpointer:
    checkpointer.setup()
    app = graph.compile(checkpointer=checkpointer)
```

The flow separates the Agent Runtime (the Python application, often powered by a Large Language Model (LLM)) from the Approval Interface (Web/Slack). In production, they coordinate through persisted state plus a thin resume endpoint, not a long-lived in-memory call stack. The checkpoint figure above is the source of truth for the sequence: pause, persist state, create a versioned approval record, notify the reviewer, validate the decision, reload the checkpoint, execute, and audit.
To build the "Resume API" component, we need an endpoint that looks up the suspended thread and issues a command to resume it. This endpoint takes a thread ID plus a versioned approval decision payload as input, validates the thread's state, and outputs a resume command to awaken the agent with the provided instructions.

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Literal
from datetime import datetime, timezone

app = FastAPI()
JsonDict = dict[str, object]

# Assume langgraph_app is compiled elsewhere
# langgraph_app = graph.compile(...)

class ApprovalDecision(BaseModel):
    approval_id: str
    expected_version: int
    action_hash: str
    decision: Literal["approve", "reject", "modify"]
    reason: str | None = None
    modified_args: JsonDict | None = None

@app.post("/api/approvals/{thread_id}/decide")
async def decide_approval(thread_id: str, decision: ApprovalDecision):
    """
    Human approves, rejects, or modifies the pending action.
    This wakes up the dormant agent.
    """
    # 1. Verify the thread currently has an outstanding interrupt
    config = {"configurable": {"thread_id": thread_id}}
    state = langgraph_app.get_state(config)
    if not any(task.interrupts for task in state.tasks):
        raise HTTPException(400, "Agent isn't waiting for input")

    # 2. Resolve the approval row with compare-and-swap semantics
    # load_pending_approval()/try_resolve_approval() are persistence helpers.
    approval = load_pending_approval(thread_id, decision.approval_id)
    if approval is None or approval["status"] != "pending":
        raise HTTPException(404, "Approval request not found")
    if approval["version"] != decision.expected_version:
        raise HTTPException(409, "Approval request is stale")
    if approval["action_hash"] != decision.action_hash:
        raise HTTPException(409, "Proposed action changed; request fresh review")
    if approval["expires_at"] <= datetime.now(timezone.utc):
        raise HTTPException(409, "Approval request expired")
    require_reviewer_permission(current_reviewer(), approval["required_role"])

    updated = try_resolve_approval(
        approval_id=decision.approval_id,
        expected_version=decision.expected_version,
        expected_action_hash=decision.action_hash,
        next_status="rejected" if decision.decision == "reject" else "authorized",
    )
    if not updated:
        raise HTTPException(409, "Approval was already resolved")

    # 3. Enqueue an idempotent resume job after recording the decision.
    # A worker invokes Command(resume=...) and retries safely if execution fails.
    enqueue_resume_job(
        thread_id=thread_id,
        approval_id=decision.approval_id,
        resume_payload={
            "approval_id": decision.approval_id,
            "expected_version": decision.expected_version,
            "action_hash": decision.action_hash,
            "decision": decision.decision,
            "reason": decision.reason,
            "modified_args": decision.modified_args,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    )
    return {"status": "decision_recorded"}
```

The compare-and-swap check records at most one reviewer decision. Store each approval request with fields such as status, version, action_hash, expires_at, required_role, and resolved_at, then resolve it with a guarded update such as WHERE id = ? AND status = 'pending' AND version = ? AND action_hash = ? AND expires_at > now(). Record authorized separately from executed: an approval may be accepted and then fail during execution. A durable resume worker must recheck current business state and use an idempotency key so a retry can't issue the same refund twice.
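One possible shape for that guarded update, sketched with the standard library's sqlite3 module. The `approvals` table layout, the `conn` and `now_iso` parameters, and this body for `try_resolve_approval` are assumptions for illustration; the endpoint above treats the helper as defined elsewhere. Timestamps are compared as same-format UTC ISO-8601 strings, which sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE approvals (id TEXT PRIMARY KEY, status TEXT, "
    "version INTEGER, action_hash TEXT, expires_at TEXT)"
)
conn.execute(
    "INSERT INTO approvals VALUES (?, 'pending', 3, ?, ?)",
    ("apr_123", "sha256:refund-78291-899", "2026-08-22T19:00:00Z"),
)

def try_resolve_approval(conn: sqlite3.Connection, *, approval_id: str,
                         expected_version: int, expected_action_hash: str,
                         next_status: str, now_iso: str) -> bool:
    """Resolve the row only if every guard still holds; True if this call won."""
    with conn:  # One transaction: the check and the write happen atomically.
        cur = conn.execute(
            "UPDATE approvals SET status = ?, version = version + 1 "
            "WHERE id = ? AND status = 'pending' AND version = ? "
            "AND action_hash = ? AND expires_at > ?",
            (next_status, approval_id, expected_version, expected_action_hash, now_iso),
        )
    return cur.rowcount == 1

click = dict(
    approval_id="apr_123",
    expected_version=3,
    expected_action_hash="sha256:refund-78291-899",
    next_status="authorized",
    now_iso="2026-08-22T18:30:00Z",
)
first = try_resolve_approval(conn, **click)
second = try_resolve_approval(conn, **click)  # Duplicate click with the same payload.
print("first click:", first)
print("second click:", second)
```

Because the guards live in the WHERE clause, two concurrent reviewers can't both win: the database applies at most one of the updates, and the loser sees a row count of zero.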
The small executable model below shows those two transitions, authorized and executed, as separate steps. It takes a versioned decision and an idempotency key as input, then shows that a stale click is rejected and an execution retry returns the previously recorded outcome.

```python
from dataclasses import dataclass

@dataclass
class Approval:
    status: str = "pending"
    version: int = 3
    action_hash: str = "sha256:refund-78291-899"

# Maps idempotency key -> recorded outcome of the side effect.
completed_effects: dict[str, str] = {}

def record_decision(
    approval: Approval,
    *,
    expected_version: int,
    action_hash: str,
) -> str:
    if approval.status != "pending" or approval.version != expected_version:
        return "blocked: stale decision"
    if approval.action_hash != action_hash:
        return "blocked: action changed"
    approval.status = "authorized"
    approval.version += 1
    return "authorized"

def execute_once(approval: Approval, *, idempotency_key: str) -> str:
    # A retry with a known key returns the stored outcome instead of re-running.
    if idempotency_key in completed_effects:
        return completed_effects[idempotency_key]
    if approval.status != "authorized":
        return "blocked: missing authorization"
    result = "refund recorded once"
    completed_effects[idempotency_key] = result
    approval.status = "executed"
    return result

pending = Approval()
print("decision:", record_decision(
    pending,
    expected_version=3,
    action_hash="sha256:refund-78291-899",
))
print("first execution:", execute_once(pending, idempotency_key="apr_123"))
print("retry:", execute_once(pending, idempotency_key="apr_123"))
print("second click:", record_decision(
    pending,
    expected_version=3,
    action_hash="sha256:refund-78291-899",
))
```

```text
decision: authorized
first execution: refund recorded once
retry: refund recorded once
second click: blocked: stale decision
```

An expiry guard is separate from version matching. This example takes the current time and expiry time as inputs, then rejects a decision whose review window has already passed.

```python
from datetime import datetime, timezone

def decision_allowed(*, expires_at: str, now: datetime) -> str:
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    return "accepted" if now < expiry else "blocked: approval expired"

clock = datetime(2026, 8, 22, 19, 0, tzinfo=timezone.utc)
print("fresh:", decision_allowed(
    expires_at="2026-08-22T19:01:00Z",
    now=clock,
))
print("stale:", decision_allowed(
    expires_at="2026-08-22T18:59:00Z",
    now=clock,
))
```

```text
fresh: accepted
stale: blocked: approval expired
```

A "Yes/No" button is rarely enough for production systems. The approval interface must provide context and control. When Riley asks "Can I refund order #78291 for $899?", the human needs to know what data will change and why.
The approval UI should show the authorized reviewer the smallest evidence packet needed for the decision: a redacted request summary, the policy trigger, source references the reviewer is allowed to inspect, and the exact arguments Riley proposes to execute. Dumping an entire customer conversation into every review card creates unnecessary privacy exposure and makes the meaningful change harder to see.
Riley should pause with a structured payload that the UI can render clearly. This payload takes authorized evidence and proposed tool arguments as input and structures them into a redacted typed object for the reviewer. For data mutations, this should ideally be a visual diff rather than raw JSON.

```typescript
// Frontend type + sample payload
type ApprovalRequest = {
  id: string;
  agentId: string;
  requestVersion: number;
  actionHash: string;
  expiresAt: string;
  action: {
    tool: "update_user_record";
    reason: string;
    policyRule: string;
    args: {
      user_id: string;
      updates: { address: string };
    };
    riskLevel: "HIGH" | "CRITICAL";
  };
  context: {
    chatHistorySummary: string;
  };
};

const pendingApproval: ApprovalRequest = {
  id: "apr_123",
  agentId: "agent_42",
  requestVersion: 4,
  actionHash: "sha256:address-change-u_123-v4",
  expiresAt: "2026-08-22T19:00:00Z",
  action: {
    tool: "update_user_record",
    reason: "User requested address change via support chat",
    policyRule: "customer_profile.write_requires_review",
    args: {
      user_id: "u_123",
      updates: { address: "123 New St" }, // Render this as a diff in the UI
    },
    riskLevel: "HIGH",
  },
  context: {
    chatHistorySummary: "User authenticated via 2FA...",
  },
};
```

By rendering this payload, the Human-in-the-Loop UI becomes a useful decision surface. The reviewer can compare Riley's proposed effect with permitted evidence before authorizing the tool call. For complex actions, consider adding a dry-run diff that shows what would happen if the action were approved. Version, expiry, and action hash fields matter too: they let the backend reject stale clicks instead of replaying an approval long after the underlying state changed.
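The dry-run diff mentioned above can start as a simple field-level comparison between the stored record and the proposed updates. This is a minimal sketch: the `field_diff` helper and the sample `current_record` are hypothetical, and a real implementation would read the current record through the reviewer's own access path.

```python
def field_diff(current: dict[str, object],
               updates: dict[str, object]) -> list[dict[str, object]]:
    """Return one entry per field whose value would actually change."""
    changes = []
    for field, new_value in sorted(updates.items()):
        old_value = current.get(field)
        if old_value != new_value:
            changes.append({"field": field, "before": old_value, "after": new_value})
    return changes

# Hypothetical stored record and proposed update, for illustration only.
current_record = {"address": "9 Old Rd", "email_opt_in": True}
proposed_updates = {"address": "123 New St", "email_opt_in": True}

for change in field_diff(current_record, proposed_updates):
    print(f"{change['field']}: {change['before']!r} -> {change['after']!r}")
```

Fields that the proposal leaves unchanged are omitted, so the reviewer's attention goes to the mutation rather than to the whole record.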
The packet builder is also a data-minimization boundary. This executable example takes a customer note and exact action as input, redacts the email address, and emits a stable hash that binds the review card to the proposed mutation.

```python
from hashlib import sha256
import json
import re

def build_packet(note: str, action: dict[str, object]) -> dict[str, object]:
    # Redact anything shaped like an email address before it reaches the reviewer.
    redacted_note = re.sub(r"[\w.+-]+@[\w.-]+", "[email redacted]", note)
    # Hash the canonical (sorted-key) JSON so the same action always hashes the same.
    action_hash = sha256(
        json.dumps(action, sort_keys=True).encode()
    ).hexdigest()[:12]
    return {"evidence": redacted_note, "action": action, "action_hash": action_hash}

packet = build_packet(
    # Placeholder address on the reserved example.com domain.
    "Contact casey@example.com only after review.",
    {"tool": "process_refund", "order_id": "78291", "amount": 899.00},
)
print("evidence:", packet["evidence"])
print("has action hash:", bool(packet["action_hash"]))
```

```text
evidence: Contact [email redacted] only after review.
has action hash: True
```

Give the reviewer a concise rationale, the exact tool arguments, a diff, and the policy rule that triggered review. Don't make raw chain-of-thought your approval primitive. Explanations can be unfaithful, and they may reveal internal reasoning or sensitive context you didn't intend to surface. [4]
Also keep approval state compact. Persist a data-minimized audit record with access controls and retention policy, then inject only structured fields such as decision, reason, modified_args, or a short reviewer summary back into the runtime. Otherwise long-lived threads accumulate reviewer chatter that burns context window on approval metadata instead of task state.
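A sketch of that trimming step, assuming the audit record is a plain dictionary. The `compact_decision` helper, the `RUNTIME_FIELDS` allow-list, the 200-character cap, and the sample record's field names are illustrative choices, not part of any framework.

```python
RUNTIME_FIELDS = ("decision", "reason", "modified_args")
MAX_SUMMARY_CHARS = 200  # Illustrative cap on free-text reviewer input.

def compact_decision(audit_record: dict[str, object]) -> dict[str, object]:
    """Keep only the structured fields the agent runtime needs to resume."""
    payload = {
        key: audit_record[key]
        for key in RUNTIME_FIELDS
        if audit_record.get(key) is not None
    }
    summary = audit_record.get("reviewer_summary")
    if summary:
        payload["reviewer_summary"] = str(summary)[:MAX_SUMMARY_CHARS]
    return payload

# Hypothetical full audit record; most of it should stay in the audit store.
audit_record = {
    "approval_id": "apr_123",
    "decision": "modify",
    "reason": "Carrier scan shows delivery; offer a partial refund.",
    "modified_args": {"order_id": "78291", "amount": 449.50, "partial": True},
    "reviewer_id": "rev_17",
    "comment_thread": ["first reviewer note", "second reviewer note"],
    "reviewer_summary": None,
}
resume_payload = compact_decision(audit_record)
print(sorted(resume_payload))
```

The reviewer identity and comment thread remain in the access-controlled audit record; only the decision itself travels back into the agent's context.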
For workflows that coordinate several durable activities, timers, retries, and approval messages, Temporal can be a better fit than a hand-rolled checkpoint table. LangGraph checkpoints can also wait durably; the choice isn't simply "short wait versus long wait." Temporal's workflow execution records event history and communicates with callers through activities plus message-passing primitives such as Signals and Updates. [5]
Signals are a good default when the approval service can fire-and-forget. If the UI needs synchronous confirmation that the workflow accepted the decision, or you want the runtime to reject a stale approval before it lands in history, an Update is usually cleaner. [6]
Temporal also gives you durable state, timers, retries, and event history out of the box. That means you don't have to build the orchestration layer for "pause, notify, wait, resume" yourself.
This approach is useful for multi-step approval chains (for example, waiting for both a support lead and finance sign-off). The workflow can wait without keeping a worker blocked, evaluate partial approvals, and continue waiting for the remaining required decision.
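Evaluating partial approvals is a small pure function, independent of the orchestrator that does the waiting. In this sketch the role names, the `REQUIRED_ROLES` set, and the `chain_status` function are hypothetical; it assumes every required role must approve and that any single rejection ends the chain.

```python
REQUIRED_ROLES = frozenset({"support_lead", "finance"})

def chain_status(decisions: dict[str, str],
                 required: frozenset[str] = REQUIRED_ROLES) -> str:
    """decisions maps role -> 'approve' or 'reject' for roles that have answered."""
    # Any rejection from a required role ends the chain immediately.
    if any(decisions.get(role) == "reject" for role in required):
        return "rejected"
    missing = sorted(role for role in required if decisions.get(role) != "approve")
    if missing:
        return "waiting: " + ", ".join(missing)
    return "authorized"

print(chain_status({}))
print(chain_status({"support_lead": "approve"}))
print(chain_status({"support_lead": "approve", "finance": "approve"}))
print(chain_status({"support_lead": "approve", "finance": "reject"}))
```

Keeping this logic deterministic and free of I/O makes it safe to call from workflow code each time a new decision arrives.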
The Temporal workflow below uses an Update for the approval decision because the UI needs to learn whether the action ID and version still match the pending request. The workflow then calls a validation activity before any external side effect.

```python
from temporalio import workflow

ApprovalPayload = dict[str, object]

# Assume activities are defined elsewhere
# plan_actions, notify_human, validate_for_execution, execute_action, is_risky = ...
# In the calls below, `...` marks elided activity options
# (for example start_to_close_timeout).

@workflow.defn
class AgentWorkflow:
    def __init__(self) -> None:
        self._pending_action_id: str | None = None
        self._pending_version: int | None = None
        self._human_decision: ApprovalPayload | None = None

    @workflow.update
    def decide(self, decision: ApprovalPayload) -> str:
        self._human_decision = decision
        return "accepted"

    @decide.validator
    def validate_decision(self, decision: ApprovalPayload) -> None:
        """Reject stale decisions before the Update is accepted."""
        if decision["action_id"] != self._pending_action_id:
            raise ValueError("stale action id")
        if decision["expected_version"] != self._pending_version:
            raise ValueError("stale action version")

    @workflow.run
    async def run(self, task: str):
        # Step 1: Plan
        plan = await workflow.execute_activity(plan_actions, task, ...)

        for action in plan.actions:
            if is_risky(action):
                self._pending_action_id = action["id"]
                self._pending_version = action["version"]
                self._human_decision = None

                # Send notification (Slack/Email)
                await workflow.execute_activity(
                    notify_human,
                    {"action_id": action["id"], "action": action},
                    ...,
                )

                # Wait durably until a validated decision Update arrives.
                await workflow.wait_condition(
                    lambda: self._human_decision is not None
                )

                decision = self._human_decision
                self._pending_action_id = None
                self._pending_version = None
                self._human_decision = None

                if decision["decision"] == "reject":
                    continue

                # The reviewer decision isn't enough: validate any modified
                # arguments and current business state in an Activity.
                action = await workflow.execute_activity(
                    validate_for_execution,
                    {"action": action, "decision": decision},
                    ...,
                )

            # Step 2: Execute with an idempotency key carried by the action.
            await workflow.execute_activity(execute_action, action, ...)
```

Keep policy code inside the workflow deterministic. If is_risky() depends on live balances, anomaly services, or a policy database, fetch that data in an Activity first and pass the result into workflow state. Temporal replays workflow code against event history, so non-deterministic logic inside the workflow will break replay. [5]
If a notification doesn't need a response, a Signal may still be appropriate. Approval buttons need synchronous accept/reject feedback, so an Update with validation is a stronger fit for this decision path. [6]
For approval chains that can run for months or accumulate a large event history, periodically use Continue-As-New to cap history size while carrying forward unresolved approval state. [5]
Building a resilient Human-in-the-Loop system goes beyond simple pause-and-resume mechanics. Production environments require granular controls to handle complex, high-volume, or ambiguous scenarios effectively. By implementing advanced patterns, we can reduce friction for human reviewers while maintaining strict safety boundaries.
Static policies need runtime context. Sending one customer email is already an external side effect that may require approval; attempting hundreds of emails in a short interval should move into a stricter queue or be blocked outright. The figure below starts from a static policy floor and conditionally escalates the required authorization layer.
Dynamic policies evaluate context by taking the proposed action and operational metadata as input, applying rule-based checks, and outputting an escalated risk tier if anomalies are detected. The following function combines the static base policy with runtime conditions. It checks the transaction amount, recent failure count, and time window, elevating the required authorization level when thresholds are crossed:

```python
from dataclasses import dataclass
from enum import Enum
from typing import cast

class RiskLevel(Enum):
    AUTO = "auto"
    NOTIFY = "notify"
    APPROVE = "approve"
    ESCALATE = "escalate"

@dataclass
class ToolPolicy:
    tool_name: str
    risk_level: RiskLevel

TOOL_POLICIES = {
    "look_up_order": ToolPolicy("look_up_order", RiskLevel.AUTO),
    "process_refund": ToolPolicy("process_refund", RiskLevel.APPROVE),
    "change_shipped_address": ToolPolicy("change_shipped_address", RiskLevel.ESCALATE),
}

RISK_PRIORITY = {
    RiskLevel.AUTO: 0,
    RiskLevel.NOTIFY: 1,
    RiskLevel.APPROVE: 2,
    RiskLevel.ESCALATE: 3,
}

def escalate_risk(current: RiskLevel, target: RiskLevel) -> RiskLevel:
    return target if RISK_PRIORITY[target] > RISK_PRIORITY[current] else current

ToolArgs = dict[str, float | str | bool]
ToolAction = dict[str, object]
RuntimeContext = dict[str, object]

def is_business_hours(context: RuntimeContext) -> bool:
    hour = int(context.get("local_hour", 12))
    return 8 <= hour < 18

def calculate_dynamic_risk(action: ToolAction, context: RuntimeContext) -> RiskLevel:
    """Riley's risk increases with refund amount, recent anomalies, and time of day."""
    tool_name = str(action["tool"])
    args = cast(ToolArgs, action.get("args", {}))
    base_risk = TOOL_POLICIES[tool_name].risk_level

    # 1. Value-based escalation: large refunds are critical
    if float(args.get("amount", 0)) > 500:
        base_risk = escalate_risk(base_risk, RiskLevel.ESCALATE)

    # 2. Anomaly-based escalation: many recent failures suggests something is wrong
    if context.get("recent_failures", 0) > 3:
        base_risk = escalate_risk(base_risk, RiskLevel.APPROVE)

    # 3. Temporal escalation: off-hours refunds get extra scrutiny
    if tool_name == "process_refund" and not is_business_hours(context):
        base_risk = escalate_risk(base_risk, RiskLevel.APPROVE)

    return base_risk

print("refund $899:", calculate_dynamic_risk(
    {"tool": "process_refund", "args": {"amount": 899.00}},
    {"recent_failures": 0, "local_hour": 14},
).value)
print("read after failures:", calculate_dynamic_risk(
    {"tool": "look_up_order", "args": {}},
    {"recent_failures": 4, "local_hour": 14},
).value)
print("shipped address:", calculate_dynamic_risk(
    {"tool": "change_shipped_address", "args": {}},
    {"recent_failures": 0, "local_hour": 10},
).value)
```

```text
refund $899: escalate
read after failures: approve
shipped address: escalate
```

A useful HITL pattern lets the human modify a proposed action rather than merely reject it. Instead of a binary yes/no choice, the human can correct arguments, such as changing a full refund proposal to a partial one or correcting an order ID. That edit is a new proposal, not a guarantee of safety.
Build the approval UI as a form, not a button. Populate the form with the agent's proposed arguments (e.g., email body, SQL query) and allow the human to edit them before hitting "Approve".
For example, if Riley proposes a full refund: process_refund(order_id="78291", amount=899.00), a human reviewer can modify it to process_refund(order_id="78291", amount=449.50, partial=True) to offer a partial refund instead. Before execution, the host must validate schema, reviewer permission, applicable policy, action version, and current order state. The edit doesn't train Riley and doesn't bypass a stronger approval tier.
If Riley needs to propose 50 return-label creations at the end of a holiday weekend, asking for separate decisions for each one creates reviewer fatigue. Reviewers who face repetitive requests may stop inspecting individual effects. A batch review can group related proposals without hiding the identity, destination, cost, or policy status of each item.
Instead of creating one alert per proposed operation, the orchestrator collects pending actions over a bounded window or groups them by a shared task identifier. The UI can then present: "Riley proposes 50 return labels (review all items)." The approval record must bind to the exact item list or item hashes, so a label inserted after approval can't ride along in the batch.
Implementing batch approvals requires per-item state and idempotency. If a batch contains 50 actions and the reviewer authorizes the fixed set, the orchestrator must track each action separately. If action 42 fails, already-completed labels aren't automatically undone unless the operation provides a compensation path. Retry only the failed authorized item with its idempotency key, and re-request review if its inputs or destination changed.
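The per-item tracking described above can be sketched as follows. This is a minimal illustration, not a production orchestrator: `BatchItem`, `run_batch`, the key format, and the `fake_create_label` stub are hypothetical names, and the sketch assumes the downstream label service deduplicates on the idempotency key. It doesn't model re-requesting review when an item's inputs change.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class ItemStatus(Enum):
    AUTHORIZED = "authorized"
    DONE = "done"
    FAILED = "failed"


@dataclass
class BatchItem:
    item_id: str
    order_id: str
    status: ItemStatus = ItemStatus.AUTHORIZED

    @property
    def idempotency_key(self) -> str:
        # Hypothetical key format: identical on every retry of the same item.
        return f"label:{self.item_id}:{self.order_id}"


def run_batch(items: list[BatchItem], create_label: Callable[[str, str], None]) -> None:
    """Attempt every item that isn't DONE and record the outcome per item."""
    for item in items:
        if item.status is ItemStatus.DONE:
            continue  # never repeat a completed side effect
        try:
            create_label(item.order_id, item.idempotency_key)
            item.status = ItemStatus.DONE
        except RuntimeError:
            item.status = ItemStatus.FAILED  # completed items are left as they are


# Stub label service that fails once for one order (assumption for the demo).
calls: list[str] = []
fail_once = {"78292"}

def fake_create_label(order_id: str, idempotency_key: str) -> None:
    calls.append(idempotency_key)
    if order_id in fail_once:
        fail_once.discard(order_id)
        raise RuntimeError("carrier API timeout")


batch = [BatchItem("label_1", "78291"), BatchItem("label_2", "78292")]
run_batch(batch, fake_create_label)  # label_2 fails on the first pass
run_batch(batch, fake_create_label)  # only label_2 is retried
print([item.status.value for item in batch])
print(calls)
```

The second pass skips `label_1` entirely, so the completed label is neither repeated nor undone, and the retry of `label_2` reuses the same key.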
Bind a batch approval to the exact ordered items shown to the reviewer. Here a later label inserted into the queue changes the digest, so it can't reuse the earlier authorization.
```python
from hashlib import sha256
import json

def batch_digest(items: list[dict[str, str]]) -> str:
    payload = json.dumps(items, sort_keys=True, separators=(",", ":"))
    return sha256(payload.encode()).hexdigest()[:12]

reviewed = [
    {"id": "label_1", "order_id": "78291"},
    {"id": "label_2", "order_id": "78292"},
]
approved_digest = batch_digest(reviewed)
changed = [*reviewed, {"id": "label_3", "order_id": "99999"}]

print("reviewed batch matches:", batch_digest(reviewed) == approved_digest)
print("inserted item matches:", batch_digest(changed) == approved_digest)
```

```text
reviewed batch matches: True
inserted item matches: False
```

Once a HITL system is live, model quality alone stops being enough. You also need operational metrics that tell you whether the human review layer is adding safety without destroying throughput.
| Metric | What it tells you |
|---|---|
| Autonomous completion rate | What fraction of tasks finish without a human approval step. Fast read on reviewer load. |
| Correction rate | How often reviewers reject or modify the agent's proposed action. A rising rate can indicate weak proposals, overly broad tools, or a mismatched policy tier. |
| Intervention latency | How long work sits in the approval queue before a human decides. This directly affects end-to-end SLA. |
| Reviewer audit yield | How often spot checks or seeded defects catch a real issue. This helps detect rubber-stamping and automation bias. |
Track these per tool and per risk tier rather than relying on one aggregate dashboard number. Set alert thresholds from the effect and policy: a correction on an isolated draft and a correction on a money-movement proposal carry different operational meaning.
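As a rough sketch of per-tool tracking, the snippet below derives three of the metrics from a review log. The record fields (`needed_review`, `decision`, `wait_s`) and the sample rows are assumptions made up for illustration, not a real schema. Reviewer audit yield is left out because it needs separate spot-check or seeded-defect data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical review log: one row per completed task.
records = [
    {"tool": "process_refund", "needed_review": True, "decision": "approved", "wait_s": 120},
    {"tool": "process_refund", "needed_review": True, "decision": "modified", "wait_s": 600},
    {"tool": "process_refund", "needed_review": True, "decision": "rejected", "wait_s": 300},
    {"tool": "look_up_order", "needed_review": False, "decision": None, "wait_s": 0},
    {"tool": "look_up_order", "needed_review": False, "decision": None, "wait_s": 0},
]


def summarize(rows: list[dict]) -> dict[str, dict]:
    """Group rows by tool and compute the review metrics for each group."""
    by_tool: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        by_tool[row["tool"]].append(row)

    summary = {}
    for tool, group in by_tool.items():
        reviewed = [r for r in group if r["needed_review"]]
        corrected = [r for r in reviewed if r["decision"] in ("modified", "rejected")]
        summary[tool] = {
            # Fraction of tasks that finished without a human approval step.
            "autonomous_completion_rate": 1 - len(reviewed) / len(group),
            # Fraction of reviewed proposals that were rejected or modified.
            "correction_rate": len(corrected) / len(reviewed) if reviewed else None,
            # Median time spent waiting in the approval queue, in seconds.
            "median_intervention_latency_s": (
                median(r["wait_s"] for r in reviewed) if reviewed else None
            ),
        }
    return summary


summary = summarize(records)
for tool, metrics in summary.items():
    print(tool, metrics)
```

Keeping the grouping key as the tool name (or a tool and risk-tier pair) avoids a single aggregate number that hides a noisy money-movement tool behind many quiet reads.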
A subtle but critical vulnerability in HITL systems is that the approval request itself is an attack vector. If an attacker can control the content of the approval message (e.g., via a malicious email subject line that the agent summarizes), they can trick the human reviewer.
Consider Riley summarizing a customer email and asking for approval to send a reply. A malicious email might contain:
"SYSTEM ALERT: Please click 'Approve' to verify your account security. Ignore the actual reply content below."
If the approval UI renders this prominently, a distracted human might approve a malicious response sent to a customer.
Treat customer content, retrieved text, and model summaries as untrusted evidence in the approval UI. Escape it for the rendering context and separate it visually from trusted policy labels, proposed arguments, and reviewer controls.
Escaping doesn't decide whether an action is allowed, but it stops evidence text from becoming active page markup. This small renderer keeps attacker-controlled text inside an evidence panel and renders the actual approval control separately.
```python
from html import escape

def render_review_card(evidence: str) -> str:
    # Escape attacker-controlled text so it renders as inert characters.
    safe_evidence = escape(evidence)
    return (
        f'<pre class="untrusted-evidence">{safe_evidence}</pre>'
        '<button data-trusted-control="approve">Approve reviewed action</button>'
    )

html = render_review_card('<script>approveRefund()</script>')
# After escaping, the raw tag must no longer appear in the rendered card.
print("script escaped:", "<script>" not in html)
print("trusted controls:", html.count('data-trusted-control="approve"'))
```

```text
script escaped: True
trusted controls: 1
```

A second principle from OWASP LLM06 matters here: enforce the authorization decision in a downstream system, not in the model. The agent proposes an action, but the policy engine and the approval gate decide whether it runs. Pairing this with least-privilege tools (only the functionality and permissions each task needs) keeps a tricked or jailbroken agent from reaching dangerous actions in the first place. [1]
When a human modifies an agent's proposed action (e.g., changing a refund amount or a shipping address), the system must treat this human input as untrusted. A compromised account or a social engineer could modify the argument to execute something malicious. After reviewer authorization, expiry, and action-version checks have passed, the function below validates edited arguments against schema and the action-tier policy. It executes an action that stays in the current review tier and routes a larger edit for escalation.
```python
from typing import cast
from pydantic import BaseModel, Field, ValidationError

class RefundArgs(BaseModel):
    order_id: str
    amount: float = Field(ge=0)
    partial: bool = False

class ToolSpec:
    def __init__(self, args_schema):
        self.args_schema = args_schema

ToolRegistry = {
    "process_refund": ToolSpec(RefundArgs),
}

class SafetyGuardrails:
    def required_tier(self, tool_name: str, validated_args: BaseModel) -> str:
        if tool_name == "process_refund":
            return "approve" if getattr(validated_args, "amount", 0) <= 500 else "escalate"
        return "reject"

safety_guardrails = SafetyGuardrails()

JsonDict = dict[str, object]
ActionState = dict[str, object]
Modification = dict[str, JsonDict]

def reject_action(state: ActionState, reason: str) -> JsonDict:
    return {"status": "rejected", "reason": reason, "state": state}

def execute_tool(tool_name: str, validated_args: BaseModel) -> JsonDict:
    return {
        "status": "executed",
        "tool": tool_name,
        "args": validated_args.model_dump(),
    }

def resume_with_modification(
    state: ActionState,
    modification: Modification,
) -> JsonDict:
    """
    Resume agent after human modified the tool arguments.
    CRITICAL: Re-validate the new arguments against safety policies.
    """
    new_args = modification["new_args"]
    pending_action = cast(JsonDict, state["pending_action"])
    tool_name = cast(str, pending_action["tool"])

    # 1. Syntax Validation (Pydantic)
    try:
        validated_args = ToolRegistry[tool_name].args_schema(**new_args)
    except ValidationError as e:
        return reject_action(state, reason=f"Invalid modification: {e}")

    # 2. The same policy is applied to the edited arguments.
    required_tier = safety_guardrails.required_tier(tool_name, validated_args)
    if required_tier == "escalate":
        return {"status": "needs_escalation", "args": validated_args.model_dump()}
    if required_tier == "reject":
        return reject_action(state, reason="Modification violated safety policy")

    # 3. Resume execution with safe arguments
    return execute_tool(tool_name, validated_args)

state = {"pending_action": {"tool": "process_refund"}}

approved = resume_with_modification(
    state,
    {"new_args": {"order_id": "78291", "amount": 449.50, "partial": True}},
)

escalated = resume_with_modification(
    state,
    {"new_args": {"order_id": "78291", "amount": 899.00}},
)

print("approved:", approved["status"], approved["args"]["amount"])
print("larger edit:", escalated["status"])
```

```text
approved: executed 449.5
larger edit: needs_escalation
```

As agent volume grows, human review can become the primary operational bottleneck. A rules engine or reviewer model can help prioritize the queue and reject proposals that clearly violate policy, provided it never turns an approval-required effect into autonomous execution.
This introduces an "AI-in-the-loop" filter that can automatically reject obvious policy violations and fast-path actions already classified as autonomous by explicit policy. The human reviewer is still required for actions at or above the approval floor.
A practical design combines three layers: hard policy rules, anomaly features, and an evaluator that emits a queue score plus rationale. The evaluator shouldn't silently override an approval-marked or critical action. Its job is triage, not final authority. Reviewer decisions may later support evaluation or training, but only after access control, purpose limitation, redaction, label-quality review, and leakage-safe dataset splitting.
Treat the policy tier as a floor. This router takes an explicit policy and a model suggestion as inputs; it lets a draft stay autonomous, but refuses to downgrade a refund proposal or destructive action.
```python
RANK = {"auto": 0, "approve": 1, "escalate": 2}
POLICY_FLOOR = {
    "draft_reply": "auto",
    "process_refund": "approve",
    "delete_customer": "escalate",
}

def routed_tier(tool: str, evaluator_suggestion: str) -> str:
    floor = POLICY_FLOOR[tool]
    if RANK[evaluator_suggestion] < RANK[floor]:
        return floor
    return evaluator_suggestion

print("draft:", routed_tier("draft_reply", "auto"))
print("refund suggested auto:", routed_tier("process_refund", "auto"))
print("delete suggested approve:", routed_tier("delete_customer", "approve"))
```

```text
draft: auto
refund suggested auto: approve
delete suggested approve: escalate
```

That human approval record is also part of your governance story. NIST AI RMF frames governance, measurement, and operational controls as lifecycle responsibilities across design, deployment, and management. [7] If a deployment is classified as a high-risk AI system under the EU AI Act, Article 14 requires effective human oversight proportionate to its risk, including abilities related to understanding limitations, automation bias, interpretation of output, and intervention or stopping the system. [8] For any consequential workflow, retain authorized, data-minimized evidence of who decided, which version and effect were reviewed, what executed, and why the outcome was recorded.
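One way to picture that evidence is a small immutable record. The sketch below is only an illustration: `ApprovalAuditRecord`, its field names, and the sample identifiers are hypothetical, and storing a digest of the reviewed arguments instead of the raw values is one possible data-minimization choice, not a requirement of either framework. A digest alone doesn't replace access control or retention limits.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from hashlib import sha256
import json


@dataclass(frozen=True)
class ApprovalAuditRecord:
    action_id: str
    action_version: int    # which version of the proposal was reviewed
    tool: str
    args_digest: str       # hash of the reviewed arguments, not the raw values
    reviewer_id: str       # who decided
    decision: str          # e.g. "approved", "modified", "rejected"
    reason: str            # why the outcome was recorded
    execution_status: str  # what actually ran
    decided_at: str        # ISO 8601 timestamp


def digest_args(args: dict[str, object]) -> str:
    # Canonical JSON so the same arguments always produce the same digest.
    payload = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return sha256(payload.encode()).hexdigest()


reviewed_args = {"order_id": "78291", "amount": 449.50, "partial": True}
record = ApprovalAuditRecord(
    action_id="act_123",
    action_version=2,
    tool="process_refund",
    args_digest=digest_args(reviewed_args),
    reviewer_id="reviewer_17",
    decision="modified",
    reason="Partial refund matches return policy",
    execution_status="executed",
    decided_at=datetime.now(timezone.utc).isoformat(),
)
print(json.dumps(asdict(record), indent=2))
```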
Suppose Riley runs unattended on Saturday night when the warehouse is closed. Design a dynamic risk policy with these rules:
Write the ToolPolicy table and the calculate_dynamic_risk function. Then identify the edge case: what happens if a customer makes three $49 refund requests in one hour to bypass the senior-review threshold?
This is a split-transaction attack. Dynamic policies must look at aggregate customer behavior, rather than single transaction amounts. A useful fix is a rolling-sum check: if a customer requests more than $100 in total refunds within 24 hours, escalate regardless of individual transaction size.
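A minimal sketch of that rolling-sum check, assuming the refund history for one customer is available as `(timestamp, amount)` pairs. The function name, the `WINDOW` and `AGGREGATE_LIMIT` constants, and the sample timestamps are illustrative, and the $100 limit is the exercise's own example threshold.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)
AGGREGATE_LIMIT = 100.00  # illustrative threshold from the exercise


def needs_aggregate_escalation(
    history: list[tuple[datetime, float]],
    proposed_amount: float,
    now: datetime,
) -> bool:
    """True when refunds inside the window plus the new one exceed the limit."""
    recent_total = sum(amount for ts, amount in history if now - ts <= WINDOW)
    return recent_total + proposed_amount > AGGREGATE_LIMIT


now = datetime(2025, 1, 4, 23, 0)
history = [
    (now - timedelta(minutes=50), 49.00),
    (now - timedelta(minutes=20), 49.00),
]

print("first $49 request:", needs_aggregate_escalation([], 49.00, now))
print("third $49 request:", needs_aggregate_escalation(history, 49.00, now))
```

Each $49 request stays under a per-transaction threshold, but the third one brings the 24-hour total to $147, so the aggregate check escalates it.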
Risk Classification: Categorize every tool into Auto, Notify, Approve, or Escalate.
Durability: Use the Checkpoint/Resume pattern (LangGraph, Temporal) so approvals can be asynchronous and survive restarts.
Modification: Allow reviewers to edit proposals, then validate the edited action again before execution.
Dynamic Policy: Context matters. Escalate risk based on velocity, time of day, and dollar amounts.
Batching: Group repetitive actions to prevent alert fatigue.
The point of HITL isn't to keep humans clicking "Approve" forever. It's to put human judgment where failure is expensive or irreversible, while keeping policy-required gates in place even as lower-risk automation improves.
You now understand how to classify agent actions by risk, pause execution with durable checkpoints, resume safely with compare-and-swap semantics, and design approval UIs that give humans real context. You also know the common traps: in-memory blocking, stale approvals, alert fatigue, and prompt injection through the approval message itself.
The next step is applying those approval patterns to AI-assisted software work. Coding agents can create useful patches, but they need the same risk tiers, review gates, data-minimized audit evidence, and human ownership before their changes reach a repository or deployment pipeline.
Symptom: The review queue grows and reviewers start blindly approving requests.
Cause: Everything is routed through the same approval path, including safe or repetitive work.
Fix: Add risk tiers, bounded batches, and triage while keeping pre-execution review for effects that policy requires humans to authorize.

Symptom: Approved actions disappear after a deploy or restart.
Cause: Approval state was stored in memory or tied to a live request thread.
Fix: Persist checkpoints and approval rows durably, then resume from stored state instead of sleeping a worker.

Symptom: Two reviewers approve the same request and the agent runs the action twice.
Cause: Approval resolution did not use version checks or compare-and-swap semantics.
Fix: Guard writes on status and version, then reject stale clicks on resume.

Symptom: Reviewer edits create unsafe tool arguments even though the model's proposal looked safe.
Cause: Human modifications were trusted without schema, authorization, or business-rule validation.
Fix: Re-validate modified arguments exactly like model-generated arguments before execution.

Symptom: The approval UI becomes a prompt-injection surface.
Cause: Attacker-controlled text or raw model summaries are rendered like trusted system instructions.
Fix: Separate untrusted content visually, show structured diffs, and keep authorization logic downstream from the model.
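For the duplicate-approval symptom above, the guarded write can be sketched with a conditional `UPDATE`. This is a minimal illustration using an in-memory SQLite database; the `approvals` table, its columns, and the `resolve` helper are hypothetical, and a production store would also need expiry and reviewer-permission checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE approvals ("
    "id TEXT PRIMARY KEY, status TEXT, version INTEGER, reviewer TEXT)"
)
conn.execute("INSERT INTO approvals VALUES ('req_1', 'pending', 1, NULL)")
conn.commit()


def resolve(conn: sqlite3.Connection, request_id: str, seen_version: int, reviewer: str) -> bool:
    """Approve only if the row is still pending at the version the reviewer saw."""
    cur = conn.execute(
        """
        UPDATE approvals
           SET status = 'approved', version = version + 1, reviewer = ?
         WHERE id = ? AND status = 'pending' AND version = ?
        """,
        (reviewer, request_id, seen_version),
    )
    conn.commit()
    # rowcount is 1 when this call won the write and 0 when the click was stale.
    return cur.rowcount == 1


first = resolve(conn, "req_1", seen_version=1, reviewer="alice")
second = resolve(conn, "req_1", seen_version=1, reviewer="bob")
print("first click applied:", first)
print("second click applied:", second)
```

Only the caller that receives `True` should go on to run the action; the stale click gets `False` and can be shown a "request already resolved" message.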
[1] OWASP Top 10 for Large Language Model Applications. OWASP Foundation · 2025
[2] LangGraph Interrupts. LangChain · 2024
[3] LangGraph Persistence. LangChain · 2026
[4] Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting. Miles Turpin, Julian Michael, Ethan Perez, Samuel R. Bowman · 2023
[5] Temporal Workflow Execution Overview. Temporal Technologies · 2026
[6] Temporal Python SDK: Workflow message passing. Temporal Technologies · 2024
[7] Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology · 2023
[8] EU AI Act: Regulation laying down harmonised rules on artificial intelligence. European Parliament and Council of the European Union · 2024