A consensus has been forming across government agencies and enterprises that human work needs to move out of the middle of every AI workflow and into a supervisory role, whether that role is described as humans on the loop rather than in the loop, operators rather than approvers, or judgment at the edges rather than the center. While the destination is widely agreed upon, the architecture required to reach it is less certain.
The point was sharpened on a recent GovExec panel that brought together leaders from the US Coast Guard, the General Services Administration, Sterling, and Kamiwaza. The conversation kept returning to a single observation: agencies still ask people to spend large portions of their day moving between systems, security domains, and approval queues, when the value of their judgment lies elsewhere. Agentic capabilities, designed correctly, should collapse that motion, and the harder question for technology leaders is what "designed correctly" actually means.
Many AI projects claim to put humans "on the loop" as supervisors, but in practice they do the opposite. Typically the AI makes a suggestion and waits for a person to click "approve" before moving to the next task. This repeats at every step, so the person stays involved in the entire process. The human ends up acting as a bridge between an AI that doesn't know the rules and a system that doesn't trust the AI to work alone, which slows things down rather than speeding them up. The work itself hasn't changed; it is human-in-the-loop with a new look.
True on-the-loop operation requires three things: that the agent can take the routine action without human intervention, that the human can be confident the action will stay inside set boundaries, and that any deviation is visible immediately rather than discovered months later in a quarterly audit. The supervisory role becomes credible only once the underlying system makes most low-risk actions safe to delegate. Without that foundation, the human stays in the loop by necessity, regardless of what the job title or the org chart says.
Four architectural properties make that delegation safe. The first is permission inheritance, meaning that an agent acting on behalf of a user should carry that user's entitlements, narrowed to what the task requires. If the user cannot see a document, neither can the agent. If the user cannot move funds beyond a threshold, neither can the agent. Unlike service accounts, which carry their own broad permissions, Relationship-Based Access Control (ReBAC) derives access from the relationships between users and resources, so the agent's access follows the user's.
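A minimal sketch of permission inheritance, assuming a toy in-memory relationship store. The names `RELATIONSHIPS`, `user_can`, and `Agent` are hypothetical illustrations, not any product's API. The agent holds no entitlements of its own: every check is answered from the user's relationships, narrowed to the resources the current task needs.

```python
from dataclasses import dataclass

# Hypothetical relationship tuples: (user, relation, resource).
RELATIONSHIPS = {
    ("alice", "viewer", "doc:budget-memo"),
    ("alice", "editor", "doc:travel-claim"),
    ("bob", "viewer", "doc:hr-file"),
}


def user_can(user: str, relation: str, resource: str) -> bool:
    """True only if the user holds this exact relationship to the resource."""
    return (user, relation, resource) in RELATIONSHIPS


@dataclass(frozen=True)
class Agent:
    """An agent with no standing permissions; it inherits from one user."""
    acting_for: str
    task_resources: frozenset  # resources the current task actually needs

    def can(self, relation: str, resource: str) -> bool:
        # Allowed only when the task needs the resource AND the user could do it.
        return resource in self.task_resources and user_can(
            self.acting_for, relation, resource
        )


agent = Agent(
    acting_for="alice",
    task_resources=frozenset({"doc:budget-memo", "doc:hr-file"}),
)

print(agent.can("viewer", "doc:budget-memo"))   # True: alice can, task needs it
print(agent.can("viewer", "doc:hr-file"))       # False: alice cannot see it
print(agent.can("editor", "doc:travel-claim"))  # False: outside the task scope
```

The design choice worth noting is that the check is an intersection: the task scope can only narrow what the user may do and can never widen it.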
The second is just-in-time credentialing for elevated actions. Most organizational work is routine, and only a small portion is consequential, so the architecture should treat the two classes differently, with the agent holding only the credentials it needs for routine work and elevation requested explicitly at the moment a higher-risk action is invoked. The Coast Guard's Jonathan White offered the most useful framing of this point on the recent panel. He compared an unscoped agent to handing a four-year-old an unlocked phone. The device can do anything the parent's account can do, which is precisely why the phone should sit in kids' mode by default. Agents need the same posture, with permissions earned for the action rather than granted in advance.
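The "kids' mode by default" posture can be sketched roughly as follows. This is an illustration under stated assumptions: the action names, the `ScopedAgent` class, and the `approve` hook are all hypothetical, and a real system would issue and revoke actual short-lived credentials rather than return strings.

```python
from typing import Callable

# Hypothetical action classes; a real deployment would derive these from policy.
ROUTINE_ACTIONS = {"read_case_file", "draft_reply"}
ELEVATED_ACTIONS = {"release_payment"}


class ScopedAgent:
    """Holds routine credentials only; elevation is requested per action."""

    def __init__(self, approve: Callable[[str], bool]) -> None:
        self._approve = approve  # hypothetical approval hook (human or policy)
        self.elevation_requests = []  # record of every elevation asked for

    def perform(self, action: str) -> str:
        if action in ROUTINE_ACTIONS:
            return "done (standing credentials)"
        if action not in ELEVATED_ACTIONS:
            return "refused (unknown action)"
        # Elevation is requested at the moment of invocation, never in advance.
        self.elevation_requests.append(action)
        if not self._approve(action):
            return "refused (elevation denied)"
        # The elevated credential is scoped to this single call and then dropped.
        return "done (one-time elevated credential)"


agent = ScopedAgent(approve=lambda action: action == "release_payment")
print(agent.perform("draft_reply"))      # done (standing credentials)
print(agent.perform("release_payment"))  # done (one-time elevated credential)
print(agent.elevation_requests)          # ['release_payment']
```

Routine work never touches the approval hook, which is what lets the human supervise by exception instead of approving every step.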
The third is identity-resolved audit, requiring that every action an agent takes, on whose behalf, against which policy, with what scope, and with what outcome, be logged in a form that satisfies both internal control reviews and external regulators. Non-repudiation, the property that an action can be unambiguously attributed to the right person and the right authorization, becomes the deciding factor in whether agentic deployments scale past departmental pilots.
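One way to picture an identity-resolved audit record is sketched below. The field names and the `AuditLog` class are assumptions made for illustration. The hash chain shown here provides tamper evidence only; genuine non-repudiation would additionally require records signed with keys bound to verified identities, which this sketch does not attempt.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class AuditRecord:
    agent_id: str
    on_behalf_of: str  # the person, not a service principal
    action: str
    policy_id: str
    scope: str
    outcome: str
    prev_hash: str  # digest of the previous record, for tamper evidence

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class AuditLog:
    GENESIS = "genesis"

    def __init__(self) -> None:
        self.records = []

    def append(self, **fields: str) -> AuditRecord:
        # Refuse to log an action that cannot be tied to a person.
        if not fields.get("on_behalf_of"):
            raise ValueError("every agent action must resolve to a person")
        prev = self.records[-1].digest() if self.records else self.GENESIS
        record = AuditRecord(prev_hash=prev, **fields)
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Check that each record still points at its predecessor's digest."""
        prev = self.GENESIS
        for record in self.records:
            if record.prev_hash != prev:
                return False
            prev = record.digest()
        return True


log = AuditLog()
log.append(agent_id="agent-7", on_behalf_of="alice", action="approve_claim",
           policy_id="POL-12", scope="claims:write", outcome="approved")
log.append(agent_id="agent-7", on_behalf_of="alice", action="notify_claimant",
           policy_id="POL-12", scope="claims:read", outcome="sent")
print(log.verify())  # True

# Rewriting who the first action was for breaks the chain at the next record.
log.records[0] = replace(log.records[0], on_behalf_of="mallory")
print(log.verify())  # False
```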
The fourth, and the one most often missed, is current context, because agents that reason over stale policy mappings or out-of-date entity relationships will produce confident answers grounded in information that no longer reflects how the organization operates. A human supervisor cannot remain on the loop if the agent keeps escalating because its context is wrong. The context layer has to keep pace with the business, refreshing as policies update, as identities change, and as the underlying systems evolve.
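A simple way to keep an agent from reasoning over stale policy is to version the source of truth and check that version before acting. The sketch below assumes a hypothetical `PolicyStore` that bumps a counter on every change. Real context layers would also track identities and entity relationships, which are left out here for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyStore:
    """Hypothetical system of record; the version bumps on every change."""
    rules: dict = field(default_factory=dict)
    version: int = 0

    def update(self, rule: str, value: float) -> None:
        self.rules[rule] = value
        self.version += 1


@dataclass(frozen=True)
class ContextSnapshot:
    version: int
    rules: dict


def take_snapshot(store: PolicyStore) -> ContextSnapshot:
    # Copy the rules so later store updates do not leak into the snapshot.
    return ContextSnapshot(version=store.version, rules=dict(store.rules))


def current_context(cached: ContextSnapshot, store: PolicyStore) -> ContextSnapshot:
    """Reuse the cached snapshot only if the store has not changed since."""
    if cached.version == store.version:
        return cached
    return take_snapshot(store)


store = PolicyStore()
store.update("approval_limit", 5000.0)
cached = take_snapshot(store)

store.update("approval_limit", 1000.0)  # policy tightens after the snapshot
fresh = current_context(cached, store)

print(cached.rules["approval_limit"])  # 5000.0 (stale)
print(fresh.rules["approval_limit"])   # 1000.0 (current)
```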
These four properties are interrelated, because permission inheritance without current context produces agents that act on yesterday's rules, and audit without scoped delegation produces a record of actions no one had clear authority to take. The architecture has to advance on all four at once.
A pattern is appearing in agency and enterprise programs that claim on-the-loop operation but cannot demonstrate it. Agents are given broad service-account access because narrow access was too hard to configure, audit logs are produced but not joined to identity so the trail terminates at a service principal rather than a person, and the context layer is built once, manually, and never refreshed. The human supervisor, faced with an agent whose actions cannot be cleanly traced back to a person or a policy, defaults to approving every step, and the promise of compression evaporates.
The failure is rarely visible in a demo and tends to surface only in production, in the form of an audit finding, a customer-facing mistake, or a quiet retreat to manual workflow because the alternative isn't auditable.
The reward for getting the architecture right is substantial. Routine work compresses into agent execution that the human supervises by exception. Judgment moves to the moments it matters: the ambiguous claim, the unusual transaction, the policy interpretation that requires a person who understands the institution. Throughput compounds, because the human is no longer the queue.
The workforce implication is meaningful and increasingly well understood. Junior staff equipped with capable agents produce what would historically have been mid-level output, senior staff become the domain experts who supervise the loop, and the middle layer compresses accordingly. For organizations that build the architecture deliberately, the transition is manageable, while for those that retrofit it onto a permissions and context model never designed for autonomous software, the transition is painful and the gains stay theoretical.
While everyone agrees on the goal of "on-the-loop" AI, the real strategic challenge is ensuring the underlying architecture actually enables it, rather than just repackaging manual "in-the-loop" processes as something new.
These four properties serve as a litmus test. Any government agency or commercial enterprise planning to deploy agentic AI at scale in the coming year should evaluate its proposed deployments against them: whether permissions are inherited from the user, whether elevation happens only just in time, whether the audit trail is continuous and tied to identity, and whether the context layer stays current. If a deployment fails even one of these tests, the human will remain in the loop, regardless of what the architecture diagram shows.
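The litmus test can be written down as a small checklist, sketched here with hypothetical field names. It encodes only the rule stated above, that failing any one of the four criteria keeps the human in the loop, and is not a formal assessment method.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DeploymentReview:
    """One yes/no answer per property; field names are illustrative."""
    permissions_inherited_from_user: bool
    elevation_just_in_time: bool
    audit_tied_to_identity: bool
    context_kept_current: bool

    def failures(self) -> list:
        # Names of every criterion the deployment does not meet.
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def supports_on_the_loop(self) -> bool:
        # A single failed criterion is enough to keep the human in the loop.
        return not self.failures()


review = DeploymentReview(
    permissions_inherited_from_user=True,
    elevation_just_in_time=True,
    audit_tied_to_identity=False,  # trail ends at a service principal
    context_kept_current=True,
)

print(review.supports_on_the_loop())  # False
print(review.failures())              # ['audit_tied_to_identity']
```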
The full panel discussion is available on demand.