From pilot to production
Summary
Most enterprise and government AI pilots succeed in controlled conditions and fail when exposed to production reality: distributed data, regulatory boundaries, and workflows that span departments. MIT's State of AI in Business 2025 finds that only 5% of companies have integrated AI tools into core workflows at scale [1]. Production-grade AI requires a coordinated effort across business, IT, security, and data teams, supported by orchestration that handles distributed data in place, enforces governance at runtime, and provides a unified audit trail across every action.
Why do AI pilots fail to reach production?
Pilots are designed to succeed in isolation. Clean datasets, curated boundaries, dedicated resources, and a single use case create the controlled conditions where models perform predictably. Production environments offer the opposite: data spread across CRM, ERP, file stores, mainframes, and edge systems; workflows that span departments, regulatory regimes, and time zones; and security and compliance requirements that prevent data consolidation.
When organizations try to move from pilot to production, they encounter the gap McKinsey's State of AI 2025 identifies: 88% of organizations use AI in at least one business function, but only 39% report measurable EBIT impact [2]. Pilots that worked on staged data break when exposed to the real architecture of the business. The failure mode is consistent and structural.
How is moving AI to production different from scaling a pilot?
Production requires the same intelligence operating across the systems, people, and rules the business actually runs on. Three differences from pilot environments matter most.
Data is distributed and cannot be consolidated. In a pilot, data sits in one staged location. In production, the same query may need information from a CRM, an ERP, a regulated database, and an edge sensor, none of which can be safely moved to a central lake without violating residency, sovereignty, or compliance requirements like HIPAA, GDPR, FedRAMP, or classification handling rules.
Permissions are dynamic and must be evaluated at runtime. A production AI agent acting on behalf of a user must inherit the appropriate permissions of that user at the moment of execution, including delegated authority and the context of the task being performed. Static role-based access control, which assigns permissions by role ahead of time, cannot evaluate delegation or task context.
Workflows span multiple systems and departments. Production AI rarely solves a single-step problem. It coordinates handoffs across procurement, finance, legal, and operations, each with its own systems, approvers, and exceptions. Production deployments live or die on this complexity.
Orchestration provides the coordination layer that production AI requires across distributed environments.
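To make the runtime-permission point concrete, here is a minimal sketch of a relationship-based check evaluated at the moment an agent acts. It is an illustration only, not Kamiwaza's ReBAC implementation: the `PolicyStore` and `Delegation` classes, the relation names, and the action-to-relation policy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Delegation:
    """One user hands a specific action on a specific resource to another user."""
    from_user: str
    to_user: str
    action: str
    resource: str


@dataclass
class PolicyStore:
    # (user, relation, resource) tuples, e.g. ("alice", "owner", "contract-17")
    relationships: set = field(default_factory=set)
    delegations: list = field(default_factory=list)
    # Hypothetical policy: which relations permit which actions.
    allowed: dict = field(default_factory=lambda: {
        "read": {"owner", "reviewer"},
        "approve": {"owner"},
    })

    def user_can(self, user, action, resource):
        """Direct check: does any relation the user holds permit the action?"""
        return any(
            (user, relation, resource) in self.relationships
            for relation in self.allowed.get(action, set())
        )

    def agent_can(self, on_behalf_of, action, resource):
        """Evaluated at call time: the agent gets only what the user holds now,
        directly or through a delegation from someone who currently holds it."""
        if self.user_can(on_behalf_of, action, resource):
            return True
        return any(
            d.to_user == on_behalf_of
            and d.action == action
            and d.resource == resource
            and self.user_can(d.from_user, action, resource)
            for d in self.delegations
        )
```

Because `agent_can` reads the relationships and delegations on every call, revoking the delegator's own access also revokes what the agent can do on the delegate's behalf, with no role reassignment step in between.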
What does it take to move AI from pilot to production?
A pilot proves the model can perform the task. A production deployment proves the organization can operate the task at scale, over time, under the security and compliance constraints the business actually faces. Three layers have to come together.
The platform layer. Production AI requires the orchestration capabilities described in the hub: distributed inference routing, locality-aware data access, runtime governance, a living institutional ontology, and unified observability. Kamiwaza delivers these through the Inference Mesh, Distributed Data Engine, ReBAC, Context Manager, Workrooms, and Kaizen. For a complete description, see What is AI orchestration?
The team layer. Production AI requires named owners on both the customer and vendor sides for data scope, governance, agent autonomy, human escalation, and audit. The next section addresses who needs to be at the table and what they need to agree on before the deployment goes live.
The workflow layer. Production AI requires the workflow itself to be redesigned around what agents do and what humans do. AI cannot be layered on top of the existing process; the workflow has to be rethought. The section "How do you pick your first production AI deployment?" below addresses workflow design directly.
Without all three layers working together, a pilot stays a pilot.
Who needs to be at the table?
Production AI succeeds or fails on coordination. The first production deployment requires a defined team on both sides of the engagement and clear ownership for the decisions that have to be made in design rather than discovered in production.
On the customer side, the team typically includes a business owner who owns the workflow outcome, a security and compliance partner who can answer questions about data residency and authorization, a data or platform engineer who can connect the orchestration layer to source systems, an operations representative who will run the system day to day, and an executive sponsor who can clear blockers when departmental priorities collide.
On the vendor side, in Kamiwaza's case, the engagement includes solutions engineers who handle integration, customer success who guides adoption, and product specialists who advise on platform configuration for the customer's specific environment.
Before the deployment goes live, this group needs to agree on:
- Which data sources are in scope and which are out
- Who owns governance for each data domain
- Which actions agents can take autonomously, and which require human approval
- Where humans intervene in the workflow, and what triggers escalation
- How audit, incident response, and policy updates will be handled
The strongest deployments name owners for each of these areas in the first two weeks of the project. Production AI is an organizational capability, and it requires coordinated investment in people and process alongside the platform itself.
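One way to keep the five agreements above from slipping is to track them as data and check for gaps before go-live. The sketch below is an assumption about how a team might do that, not a prescribed tool; the area identifiers, the `Decision` record, and `unresolved_areas` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# The five areas the group needs to agree on (identifiers are illustrative).
DECISION_AREAS = [
    "data_scope",
    "governance",
    "agent_autonomy",
    "human_escalation",
    "audit_and_incident_response",
]


@dataclass
class Decision:
    area: str
    owner: Optional[str] = None  # a named person, not a team alias
    agreed: bool = False


def unresolved_areas(decisions):
    """Return the areas that still lack a named owner or a recorded agreement."""
    by_area = {d.area: d for d in decisions}
    missing = []
    for area in DECISION_AREAS:
        decision = by_area.get(area)
        if decision is None or not decision.owner or not decision.agreed:
            missing.append(area)
    return missing
```

A go-live review could simply require that `unresolved_areas(...)` returns an empty list.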
A production example: Town of Vail housing compliance
The Town of Vail, Colorado, has spent decades building a deed-restricted housing inventory for the workforce that keeps the town running year-round. Interpreting each deed restriction required deep manual review across legacy systems like Laserfiche, often taking days or weeks. The Town had tried twice over a combined 12 months to manually classify deeds into a structured spreadsheet, and both efforts failed because forcing complex, highly variable legal agreements into rigid categories stripped out the nuance required to interpret them correctly.
The production deployment took a different approach. The Vail housing team, Kamiwaza, and HPE worked together to identify deed restriction compliance as the first workflow, then integrated Kamiwaza directly into the Town's existing Laserfiche document management system. The deployment runs entirely behind the Town's firewall, and no data migration was required.
Kaizen, Kamiwaza's flagship AI agent, reads the full body of each deed restriction document and reasons across it the way a seasoned compliance analyst would. It answers 38 compliance classification questions with evidence-grounded logic, generates structured reports with zero manual data entry, and gives staff a plain-language interface to query the entire portfolio. The workflow itself was redesigned around what the agent does (read and interpret documents at scale) versus what the housing team does (apply judgment to the cases that need it).
The result: a 90% reduction in deed case interpretation time, with community members receiving answers in hours instead of weeks. Director of Housing Jason Dietz:
"The biggest challenge wasn't the technology. It was accepting that we had never been able to unlock this level of consistency and rigor using human interpretation alone."
— Jason Dietz, Director of Housing for Town of Vail
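The division of labor in this example (the agent reads and answers with evidence, staff apply judgment where needed) can be sketched as a simple triage step. This is a hypothetical illustration, not the schema or logic Kaizen uses: the `Evidence` and `Answer` records and the `triage` function are invented for the example, and the rule shown (every answer must quote text that actually appears in the cited document) is one assumed way to define "evidence-grounded".

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    document_id: str
    quote: str  # verbatim passage the answer relies on


@dataclass
class Answer:
    question_id: str
    value: str
    evidence: list = field(default_factory=list)


def triage(answers, document_texts):
    """Split answers into those backed by quotes found in the cited documents
    and those that should go to a human reviewer."""
    grounded, needs_review = [], []
    for answer in answers:
        supported = bool(answer.evidence) and all(
            e.quote and e.quote in document_texts.get(e.document_id, "")
            for e in answer.evidence
        )
        (grounded if supported else needs_review).append(answer)
    return grounded, needs_review
```

Answers with no evidence, or with a quote that cannot be found in the source text, land in the review queue rather than in the structured report.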
How do you pick your first production AI deployment?
The strongest first deployments share four properties.
- High pain, clear boundaries. Select a workflow where the current state is visibly painful (manual extraction, multi-day cycle times, error-prone handoffs) but the scope is well-defined. The pain justifies investment; the boundaries keep complexity manageable for the first deployment.
- A workflow worth redesigning. Choose a multi-step workflow with clear repetitive components such as document extraction, classification, routing, validation, or data preparation. These are the components AI agents handle well. Then redesign the workflow around the new division of labor: the agent handles the repetitive work, the human handles judgment, negotiation, exceptions, and final approval. AI cannot be layered on top of the existing process and expected to deliver business value; the workflow itself has to be rethought. Anchor the redesign to a measurable business outcome the organization cares about: faster cycle times, fewer support tickets, more accurate quotes, better customer experience. A clear outcome is what turns the deployment into a business initiative.
- Existing data the orchestration layer can reach. Choose a workflow where the relevant data already exists in digital form and can be reached in place. Build production confidence on what you have first rather than starting with a project that requires new data collection or system migration.
- Engaged stakeholders and an executive sponsor. The process owner needs to want the workflow change. The security and compliance partner needs to be involved from day one. The executive sponsor needs to be able to clear blockers when departmental priorities collide.
Forrester's 2025 research on orchestrating AI found that 49% of organizations are actively seeking end-to-end solutions to overcome siloed workflows and fragmented AI efforts [3]. The first production deployment is the moment to establish those patterns. Subsequent deployments reuse them.
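The four properties above can be turned into a rough screening checklist for candidate workflows. The sketch below is one possible encoding, offered as an assumption rather than a scoring method from the source; the `Candidate` fields and the all-or-nothing rule for each property are illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    high_pain: bool
    clear_boundaries: bool
    repetitive_multistep: bool
    measurable_outcome: bool
    data_reachable_in_place: bool
    willing_process_owner: bool
    security_partner_engaged: bool
    executive_sponsor: bool


def screen(candidate):
    """Return the names of the four properties the candidate does not yet meet."""
    c = candidate
    checks = {
        "High pain, clear boundaries": c.high_pain and c.clear_boundaries,
        "A workflow worth redesigning": c.repetitive_multistep and c.measurable_outcome,
        "Existing data the orchestration layer can reach": c.data_reachable_in_place,
        "Engaged stakeholders and an executive sponsor": (
            c.willing_process_owner
            and c.security_partner_engaged
            and c.executive_sponsor
        ),
    }
    return [prop for prop, ok in checks.items() if not ok]
```

An empty result suggests the workflow is a reasonable first candidate; a non-empty one names what to fix or why to pick something else.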
Common pitfalls
- The perfection trap. Waiting for clean data, complete processes, or ideal infrastructure before starting. Production AI improves incrementally; the first deployment is rarely perfect, and it does not need to be.
- Treating production as a fixed end state. Pilots optimize for accuracy on a fixed dataset. Production optimizes for resilience as data, regulations, and workflows change over time. Build observability and governance into the first deployment so the system can adapt.
- Underestimating the human element. Production AI shifts what people do. Plan the operational handoff: which decisions agents handle autonomously, which escalate to humans, how human corrections feed back into the system.
- Centralizing instead of orchestrating. Production rarely works through data lakes, mass reformatting, or replacement of legacy systems. Orchestrate across what exists.
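The operational handoff described in the pitfalls above (what agents do autonomously, what escalates, how corrections are captured) can be sketched as a small routing function with an audit log. Everything here is a hypothetical illustration: the action names, the confidence threshold of 0.8, and the `AuditLog`, `route`, and `record_correction` helpers are assumptions, not part of any named product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical allow-list of actions an agent may take without approval.
AUTONOMOUS_ACTIONS = {"extract_fields", "classify_document"}


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, actor, action, outcome):
        """Append one timestamped entry; agent and human actions share the log."""
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "outcome": outcome,
        })


def route(action, confidence, log, threshold=0.8):
    """Execute only allow-listed actions above the threshold; escalate the rest."""
    if action in AUTONOMOUS_ACTIONS and confidence >= threshold:
        log.record("agent", action, "executed")
        return "executed"
    log.record("agent", action, "escalated")
    return "escalated"


def record_correction(action, reviewer, corrected_value, log):
    """Capture a human correction so it can feed back into later tuning."""
    log.record(reviewer, action, f"corrected to {corrected_value!r}")
```

Keeping agent decisions and human corrections in the same log is what lets the team later review where escalation thresholds are too loose or too strict.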
Citations used on this page
1. MIT, State of AI in Business 2025: only 5% of companies have AI tools integrated into core workflows at scale.
2. McKinsey, State of AI 2025: 88% of organizations use AI in at least one business function; 39% report measurable EBIT impact.
3. Forrester, Orchestrating AI 2025: 49% of organizations seek end-to-end solutions to overcome siloed workflows.