Intelligence Delivery versus Intelligence Consumption: Part 2

AI orchestration is defined by the delivery and consumption of AI inference

In Part 1 of this series, I demonstrated how the enterprise application development and operations market splits into two categories of products, separated by the line between capacity consumption and capacity delivery. That split is driven in large part by the separation of buyers and budgets in the enterprise around those responsibilities. Operations is responsible for making sure there is network, storage, and compute available for applications. Development is responsible for writing applications that have a positive impact on outcomes and that consume those resources.

[Diagram: capacity consumption versus capacity delivery]

In this post, I want to explore how AI is evolving in a similar way, but with the line between “consumption” and “delivery” defined by a different need: the need for intelligence. I will argue here that the AI market will split into two distinct layers: the tools and platforms that generate and send prompts, and the orchestration and intelligent infrastructure that process them.

Intelligent Infrastructure

The effective 'API' of the AI market consists of the prompt and the underlying logic—augmentation and inference—required to answer it.

Here’s why I think that is so.

What do AI Applications and Agents Want?

Before I begin, I want to lump applications that use AI, along with AI agents, under the single term “AI software.” This will make what I am about to argue much more readable.

Let’s start with a simple question. If you write software that intends to use AI to solve a problem, generate an output, or provide any other valuable outcome, what do you need to make that happen? What is it that AI software wants from AI?

I would argue that the answer is simple. It wants to submit a prompt, and get an intelligent answer in response. That’s it. There may be some requirements to retrieve data beforehand to include in the prompt, or to validate that it’s secure to submit the prompt, but for the most part the AI software tier wants to be able to ask questions or submit requests that aim to achieve a certain objective, and get a response that meets that objective.

Period.

Everything else that a model, AI orchestrator, or other backend platform may do is secondary and in support of that first objective. So everything in the “backend” of the AI “stack” is ultimately serving the key purpose of taking a prompt and providing a response—efficiently and securely.
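To make that concrete, here is the entire contract from the AI software tier’s point of view, sketched as a minimal Python interface. The IntelligenceProvider name and infer() method are illustrative stand-ins, not any real product’s API.

```python
from typing import Protocol


class IntelligenceProvider(Protocol):
    """Illustrative name for the only thing the AI software tier needs."""

    def infer(self, prompt: str) -> str:
        """Submit a prompt, get an intelligent answer back.

        Model selection, retrieval, guardrails, and capacity management
        all happen behind this call, invisible to the caller.
        """
        ...
```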

The Prompt is the Interface

This means that the “interface” between the application/agent tier and the models that provide intelligence is the prompt and whatever API triggers inference. The language and structure used don’t really matter. As long as the AI models in the orchestration layer can interpret the tokens and generate a reasonable answer, the software is happy.

This is profound, in that it means AI infrastructure can be built so that the management of inference and AI models is decoupled from the management of the applications that use them. This is already evident in the way AI chat interfaces work. There is no special code or configuration required to do sophisticated things with ChatGPT or Claude. You just need the right prompt, written in a language those models can interpret.
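Here is a minimal sketch of what that decoupling looks like from the application side, assuming the orchestration layer exposes an OpenAI-compatible chat endpoint. The URL, the application token, and the “enterprise-default” model alias are all hypothetical.

```python
import requests  # plain HTTP; any client would do

# Hypothetical orchestration endpoint exposing an OpenAI-compatible chat API.
# The application knows this URL and an alias; it does not know (or care)
# which model, vendor, or hardware actually serves the request.
ORCH_URL = "https://ai-gateway.example.internal/v1/chat/completions"


def ask(prompt: str) -> str:
    resp = requests.post(
        ORCH_URL,
        headers={"Authorization": "Bearer <application-token>"},
        json={
            "model": "enterprise-default",  # an alias the orchestrator resolves
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    # Standard chat-completions shape: the first choice's message text.
    return resp.json()["choices"][0]["message"]["content"]


print(ask("Summarize last quarter's incident reports in three bullets."))
```

Swap whatever sits behind that alias and this application code does not change. That is the decoupling in practice.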

The same can be true for your enterprise AI operations. Today, you can deploy shared models, build RAG that accesses your existing data systems, and even add guardrails like access control, prompt validation, and hardware resource management to remove those concerns from application and agent builders. The only question is how much effort is involved.
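To illustrate that division of labor, here is a sketch of the request path a shared layer could own. Every helper and policy below is a toy stand-in, not a description of any particular platform.

```python
# A sketch of the request path a shared AI layer could own so that
# application and agent builders never have to. Every helper below is a
# trivial stand-in for whatever your platform actually provides.

BLOCKLIST = ("delete all", "exfiltrate")  # toy policy, illustrative only


def prompt_passes_validation(prompt: str) -> bool:
    """Guardrail stand-in: reject obviously disallowed prompts."""
    return not any(term in prompt.lower() for term in BLOCKLIST)


def allowed_data_sources(user: str) -> list[str]:
    """Access-control stand-in: map a user to the data they may query."""
    return {"analyst": ["sales_db", "wiki"]}.get(user, ["wiki"])


def retrieve_context(prompt: str, sources: list[str]) -> str:
    """RAG stand-in: fetch relevant snippets from the allowed sources."""
    return f"(snippets retrieved from {', '.join(sources)} for: {prompt!r})"


def run_inference(augmented_prompt: str) -> str:
    """Inference stand-in: in practice, the orchestrator routes this to a model."""
    return f"(model response to: {augmented_prompt!r})"


def handle_prompt(user: str, prompt: str) -> str:
    """What the shared layer does on every request."""
    if not prompt_passes_validation(prompt):
        raise PermissionError("Prompt rejected by policy")
    sources = allowed_data_sources(user)
    context = retrieve_context(prompt, sources)
    augmented = f"Context:\n{context}\n\nQuestion:\n{prompt}"
    return run_inference(augmented)


print(handle_prompt("analyst", "What were last quarter's top support issues?"))
```

The point is not the toy logic; it is that none of these steps live in the application or agent code.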

AI Orchestration is the Key

One way to remove much of the complexity of supporting disparate AI systems is to adopt an AI orchestration layer that coordinates things like security, data retrieval, and inference execution across models, locations, and operations approaches.

Think of it like this: what are the things an enterprise wants to do when it receives a prompt from AI software (or even a human through a chat interface)? I think the following is a minimal list (a rough code sketch of these steps follows it):

  1. Validate that the prompt submitter has the right to access the data, models, and infrastructure required to fulfill the prompt.
  2. Determine the locations and models that will satisfy the prompt. (The choice may be driven by physical constraints, regulatory limits, or simply efficiency.)
  3. Verify that there is capacity available to respond to the prompt, and route the inference work to that capacity. (Capacity, in this case, is mostly memory and the right processors.)
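A minimal sketch of those three steps, using illustrative data shapes and toy policies rather than any particular product’s API, might look like this:

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    model: str
    location: str          # e.g. "eu-datacenter", "edge-site-3"
    free_gpu_mem_gb: int   # crude proxy for available capacity


# Illustrative catalog of model placements the orchestrator knows about.
CATALOG = [
    Endpoint("llama-70b", "eu-datacenter", 48),
    Endpoint("small-summarizer", "edge-site-3", 8),
]


def orchestrate(user: str, prompt: str, region: str, needed_mem_gb: int) -> str:
    # 1. Validate that the submitter may use the required data, models, and infrastructure.
    if user not in {"analyst", "agent-billing"}:  # toy policy
        raise PermissionError("Submitter not authorized for this request")

    # 2. Determine which locations and models can satisfy the prompt
    #    (here: a simple regulatory/placement filter).
    candidates = [e for e in CATALOG if e.location.startswith(region)]
    if not candidates:
        raise LookupError("No model placement satisfies the constraints")

    # 3. Verify there is capacity (memory, the right processors) and
    #    route the inference work to it.
    target = next((e for e in candidates if e.free_gpu_mem_gb >= needed_mem_gb), None)
    if target is None:
        raise RuntimeError("No capacity available; queue or scale out")

    return f"dispatching prompt to {target.model} at {target.location}"


print(orchestrate("analyst", "Classify these invoices.", region="eu", needed_mem_gb=16))
```

A real orchestrator would, of course, pull policy, placement, and capacity data from live systems rather than a hard-coded catalog.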

AI orchestration that automates these concerns removes a lot of the toil and risk that comes with deploying and operating AI systems. That, in turn, makes it cheaper and less risky to experiment with, scale, and harden the systems that create great business outcomes.

I find it hard to believe that an enterprise will successfully scale with solutions that don’t respect this boundary between creating prompts and responding to prompts. The dependencies created when the components asking the questions are also the components answering them are almost impossible to coordinate in a large organization. Too many deployment dependencies. Too many compliance conflicts. Too many security risks.

But scale is much easier to achieve and risk is much easier to control with an AI orchestration layer that provides common guardrails, infrastructure, and data access.

Who do you turn to for AI orchestration?

Which brings us to how Kamiwaza can help your organization scale AI operations today, or enable you to find value from AI applications and agents in the near future. In the last post in this series, I will go through all of the things Kamiwaza does to meet these objectives safely and efficiently. I’ll even go into some innovations that enable you to do things that you haven’t been able to do until now. 

Are you ready to manage AI consumption independently of AI delivery? Can you see the advantages that would bring your organization as you begin this challenging journey into new architectures, practices, and—yes—new solutions?

As always, I write to learn. Please don’t hesitate to comment below whether you agree or disagree. Your insights or questions are always incredibly welcome.
