What is the Inference Mesh?
Run AI inference directly at the data source
The Inference Mesh is Kamiwaza's distributed AI inference layer. It enables AI models to run directly at the data source rather than requiring data to travel to a central compute environment.
How It Works
Rather than sending your data to a central cloud for AI processing, the Inference Mesh deploys AI inference capabilities wherever your data already lives. Lightweight inference components run on cloud nodes, on-premises servers, edge gateways, and other distributed infrastructure. The mesh coordinates these compute resources, routing each inference task to the optimal location based on compute efficiency, latency, and security requirements.
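To make the routing idea concrete, here is a minimal sketch of how a scheduler could score candidate nodes against those three criteria. The `Node` fields and the `score` and `route` functions are illustrative assumptions, not Kamiwaza's actual API.

```python
# Hypothetical sketch of mesh routing: score nodes on security, capacity,
# data locality, and latency, then pick the best. Not Kamiwaza's real API.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    latency_ms: float      # round-trip latency from the requester
    free_memory_gb: float  # headroom for model weights and KV cache
    data_local: bool       # True if the request's data already lives here
    zone: str              # e.g. "onprem", "cloud", "edge"

def score(node: Node, needed_gb: float, allowed_zones: set[str]) -> float:
    """Higher is better; -inf marks an ineligible node."""
    if node.zone not in allowed_zones or node.free_memory_gb < needed_gb:
        return float("-inf")                 # hard security / capacity constraints
    locality_bonus = 100.0 if node.data_local else 0.0
    return locality_bonus - node.latency_ms  # prefer local data, then low latency

def route(nodes: list[Node], needed_gb: float, allowed_zones: set[str]) -> Node:
    best = max(nodes, key=lambda n: score(n, needed_gb, allowed_zones))
    if score(best, needed_gb, allowed_zones) == float("-inf"):
        raise RuntimeError("no eligible node for this request")
    return best

nodes = [
    Node("cloud-a", latency_ms=45.0, free_memory_gb=80.0, data_local=False, zone="cloud"),
    Node("onprem-1", latency_ms=5.0, free_memory_gb=24.0, data_local=True, zone="onprem"),
]
print(route(nodes, needed_gb=16.0, allowed_zones={"onprem"}).name)  # -> onprem-1
```

Note the ordering of concerns: security zones and memory capacity act as hard constraints, while data locality and latency are soft preferences that break ties among eligible nodes.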
Silicon-Agnostic by Design
Kamiwaza is hardware-agnostic. The Inference Mesh runs on NVIDIA GPUs as well as Intel, AMD, and Ampere CPUs, among other platforms, without requiring investment in specific infrastructure. Organizations can leverage existing hardware rather than migrating to a particular cloud or compute platform.
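As an illustration of what silicon-agnostic selection can look like in practice, the sketch below probes for a CUDA-capable PyTorch device and otherwise falls back to the host CPU architecture. The `pick_backend` helper is hypothetical, not Kamiwaza's actual device discovery.

```python
# Illustrative sketch of silicon-agnostic backend selection; the backend
# names and pick_backend helper are assumptions, not Kamiwaza internals.
import platform

def pick_backend() -> str:
    """Choose an inference backend based on the silicon that is present."""
    try:
        import torch  # optional dependency, used here only for GPU detection
        if torch.cuda.is_available():
            return "cuda"  # NVIDIA GPUs (or AMD GPUs via a ROCm-built torch)
    except ImportError:
        pass
    # Fall back to CPU: x86_64 covers Intel/AMD, aarch64 covers Ampere/ARM.
    arch = platform.machine().lower()
    return f"cpu-{arch}"

print(pick_backend())  # e.g. "cuda", "cpu-x86_64", or "cpu-aarch64"
```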
Key Capabilities
- Distributed inference across cloud, on-premises, and edge environments simultaneously
- Adaptive model selection based on task requirements and available compute
- Intelligent model splitting for workloads spanning multiple nodes (see the sketch after this list)
- Unified memory awareness across distributed compute environments
- Validated on NVIDIA DGX Spark, Intel, AMD, and Ampere platforms
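The sketch below illustrates the model-splitting and memory-awareness ideas together: a toy partitioner that assigns a model's layers to nodes in proportion to each node's free memory. `split_layers` is a hypothetical illustration under simplified assumptions (uniform layer sizes, contiguous blocks), not Kamiwaza's placement algorithm.

```python
# Toy layer partitioner: split a model across nodes in proportion to each
# node's free memory. An illustration only, not Kamiwaza's real algorithm.
def split_layers(num_layers: int, free_gb_per_node: list[float]) -> list[int]:
    """Assign each node a contiguous block of layers proportional to its free memory."""
    total = sum(free_gb_per_node)
    shares = [int(num_layers * gb / total) for gb in free_gb_per_node]
    # Distribute layers lost to integer truncation, starting with the roomiest node.
    for i in sorted(range(len(shares)), key=lambda j: free_gb_per_node[j], reverse=True):
        if sum(shares) == num_layers:
            break
        shares[i] += 1
    return shares

# A 32-layer model across a 24 GB node, a 16 GB node, and an 8 GB edge box:
print(split_layers(32, [24.0, 16.0, 8.0]))  # -> [17, 10, 5]
```

A real placement algorithm would also account for inter-node bandwidth and activation sizes, but the core idea is the same: the mesh treats memory across nodes as one pool and sizes each node's slice of the model accordingly.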
The Core Advantage
Conventional enterprise AI approaches require data to travel to compute: datasets are moved to a central cloud where the models run. The Inference Mesh inverts this: compute travels to the data. That inversion eliminates the security risk, compliance exposure, and latency of data movement, making AI feasible in environments where centralization is impossible.