Human Approval for AI Agents: When Agents Should Stop

When should the agent stop being autonomous?

The tempting answer is to ask for approval whenever the model sounds uncertain. That answer is not useless, but it is too vague to act on. Human approval for AI agents is the workflow gate that pauses risky actions until a person reviews the evidence, policy result, proposed action, and rollback path. It should be triggered by risk class, not by model uncertainty alone.

Figure: An agent policy gate routing read, write, and external actions.

When this matters

  • The action changes money, customer state, production systems, permissions, or public communication.
  • The agent has enough evidence to propose an action but not enough authority to execute it.
  • A reviewer needs the trace, not just the final answer.
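
The trigger described above can be sketched as a small check on risk class. This is a minimal illustration, not a reference implementation: the class names in `HIGH_RISK_CLASSES`, the `ProposedAction` type, and the `needs_human_approval` function are all hypothetical names chosen for this example. Model confidence is recorded for the reviewer but deliberately does not waive the gate; a team could add uncertainty as an extra escalation signal on top of this.

```python
from dataclasses import dataclass

# Hypothetical risk classes, mirroring the list above.
HIGH_RISK_CLASSES = {
    "money",
    "customer_state",
    "production",
    "permissions",
    "public_communication",
}


@dataclass(frozen=True)
class ProposedAction:
    name: str
    risk_class: str          # e.g. "read", "draft", "money"
    model_confidence: float  # kept for the reviewer, not used as the trigger


def needs_human_approval(action: ProposedAction) -> bool:
    """Pause on risk class; high confidence does not waive the gate."""
    return action.risk_class in HIGH_RISK_CLASSES


# A confident refund still pauses; an uncertain read-only lookup does not.
refund = ProposedAction("issue_refund", "money", model_confidence=0.99)
lookup = ProposedAction("read_crm_record", "read", model_confidence=0.40)
print(needs_human_approval(refund), needs_human_approval(lookup))
```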

Failure modes to catch

  • Approval asks a human to judge text without showing source evidence.
  • The agent requests approval after it already changed the external system.
  • All approvals look the same, even when risk levels differ.
  • Reviewer decisions are not written back into the trace.
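
One way to guard against these failure modes is to put the side effect behind a single function that refuses to run without evidence, asks the reviewer before executing, and records the decision in the trace. The sketch below assumes a hypothetical `GatedAction` type, a reviewer modeled as a plain callable returning `(decision, reason)`, and a trace modeled as a list of dicts; a real system would use its own tracing and review tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class GatedAction:
    name: str
    risk_tier: str                  # hypothetical tiers, e.g. "low" or "high"
    execute: Callable[[], str]      # the side effect; only called after approval
    evidence: list = field(default_factory=list)


def run_with_gate(action: GatedAction, reviewer: Callable, trace: list) -> Optional[str]:
    """Review first, execute second, and write every step to the trace."""
    if not action.evidence:
        # Do not ask a human to judge text without source evidence.
        trace.append({"action": action.name, "event": "blocked",
                      "reason": "no evidence attached"})
        return None

    decision, reason = reviewer(action)
    # The risk tier and the reviewer's reasoning are written back into the trace.
    trace.append({"action": action.name, "event": "decision",
                  "tier": action.risk_tier, "decision": decision, "reason": reason})
    if decision != "approve":
        return None

    result = action.execute()       # the external system changes only here
    trace.append({"action": action.name, "event": "executed", "result": result})
    return result


trace = []
action = GatedAction("send_email", "high", lambda: "delivered",
                     evidence=["crm:record-123"])
run_with_gate(action, lambda a: ("reject", "claim not supported by source"), trace)
```

Because `execute` is only reachable through the approval branch, the agent cannot request approval after it has already changed the external system.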

Human approval matrix

Gate       Signal                                        Action
Trigger    money, deploy, delete, send, permission       Pause before action
Evidence   sources, tool outputs, evals, policy result   Attach packet
Decision   approve, reject, edit, ask for more info      Write to trace
Rollback   how to undo or mitigate                       Require for mutation
Learning   why reviewer decided                          Convert into future rule

Running example

A sales agent drafts a customer email after reading a CRM record. Drafting is allowed. Sending is paused. The reviewer sees the recipient, claims, source evidence, policy result, and exact outbound message before approving.
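
The running example can be sketched in a few lines. Everything here is illustrative: `OutboundEmail`, `draft_email`, `approve`, and `send_email` are hypothetical names, the CRM record is a plain dict, and no email is actually sent. The one design choice worth noting is that approval is bound to the exact message the reviewer saw, so an edit after approval pauses the send again.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutboundEmail:
    recipient: str
    body: str
    claims: list
    source_evidence: list
    policy_result: str
    status: str = "drafted"
    approved_body: Optional[str] = None


def draft_email(crm_record: dict) -> OutboundEmail:
    """Drafting is allowed without approval: nothing leaves the system."""
    body = (f"Hi {crm_record['name']}, your {crm_record['plan']} plan "
            f"renews on {crm_record['renewal_date']}.")
    return OutboundEmail(
        recipient=crm_record["email"],
        body=body,
        claims=[f"renewal date is {crm_record['renewal_date']}"],
        source_evidence=[f"crm:{crm_record['id']}"],
        policy_result="pass",  # placeholder for a real policy check
    )


def approve(email: OutboundEmail) -> None:
    """The reviewer approves the exact text they were shown."""
    email.approved_body = email.body
    email.status = "approved"


def send_email(email: OutboundEmail) -> str:
    """Sending stays paused until the exact outbound message is approved."""
    if email.status != "approved" or email.approved_body != email.body:
        return "paused: awaiting human approval"
    email.status = "sent"
    return f"sent to {email.recipient}"


record = {"id": "42", "name": "Dana", "email": "dana@example.com",
          "plan": "Pro", "renewal_date": "2025-01-01"}
email = draft_email(record)
print(send_email(email))  # paused until a reviewer calls approve(email)
```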

Copy the working template

Use the human approval matrix above as the v1 artifact. Replace the example signals with your own agent names, tools, risk classes, and thresholds, then link the result to your monitoring, tracing, security, and evaluation gates.

Frequently Asked Questions

When should AI agents require human approval?

Require human approval before external sends, destructive changes, money movement, production deploys, permission changes, regulated decisions, or customer-visible actions.

What should an approval request include?

An approval request should include the proposed action, evidence, source links, tool outputs, risk class, policy result, eval result, expected side effect, and rollback path.
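
A completeness check on the approval packet might look like the following. The field names in `REQUIRED_FIELDS` are hypothetical keys that mirror the list above, and the packet is modeled as a plain dict for simplicity; the example values are made up.

```python
# Hypothetical field names, one per item in the list above.
REQUIRED_FIELDS = (
    "proposed_action",
    "evidence",
    "source_links",
    "tool_outputs",
    "risk_class",
    "policy_result",
    "eval_result",
    "expected_side_effect",
    "rollback_path",
)


def missing_fields(packet: dict) -> list:
    """Return the required fields that are absent or empty in the packet."""
    return [name for name in REQUIRED_FIELDS if not packet.get(name)]


packet = {
    "proposed_action": "send renewal email",
    "evidence": ["CRM record shows renewal date"],
    "source_links": ["crm:42"],
    "tool_outputs": ["crm.read ok"],
    "risk_class": "public_communication",
    "policy_result": "pass",
    "eval_result": "pass",
    "expected_side_effect": "one email sent to the customer",
    # "rollback_path" is intentionally left out to show the check failing.
}
print(missing_fields(packet))
```

A gate could refuse to open a review while `missing_fields` returns anything, so reviewers never see an incomplete packet.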

Should every agent action need approval?

No. Over-approval makes the system unusable. Use risk tiers so read-only and low-risk draft work can continue while high-risk actions pause.

The Takeaway

Human approval is not a brake on agents. It is how useful autonomy crosses high-risk boundaries without pretending the model owns the risk.
