Data Sovereignty Beyond Political Borders

Written by James Urquhart | Feb 3, 2026 2:15:00 PM

One interesting problem that AI brings to light in a way previous technologies didn’t is the importance of understanding logical boundaries in the movement and processing of data. As AI makes it simpler to pass information between agents, humans, and other systemic actors, data that was previously “protected” by logistical barriers becomes increasingly subject to leakage outside its intended uses.

OpenAI’s announcement of ChatGPT Health and Anthropic’s competing announcement are fantastic illustrations of why data access control is such a critical element of AI orchestration in the enterprise. Not only are they great examples of widely divergent data domains, but also of how these domains can overlap and conflict with each other. The result is a lot of complexity in defining access rules.

There are three key analogs that enterprises should consider when protecting their data:

  • While enterprises frequently store and protect individual data, individuals (and their governments) are still demanding control over who uses it and where it can be used.
  • The same holds true for B2B data. While an enterprise may entrust its data to be stored in the cloud somewhere, it still wants complete control over who (or what) can access that data.
  • Protecting different data sets independently is not enough. Increasingly, enterprises are going to find themselves having to define access in terms of complex relationships across data sets.

When it comes to self-control of sensitive information about a single person, we call that “individual data sovereignty.” When it comes to the same for an organization—a business, an institution, or an agency—we call that “organizational data sovereignty.”

I’d like to explore the ways enterprises are responsible for data sovereignty. How do you manage that responsibility in an age where AI systems can potentially self-organize and attempt to share data unexpectedly?

Enterprise Data Sovereignty

Let’s start with a simple definition of sovereignty:

Sovereignty is generally defined as supreme, independent control and lawmaking authority over a territory. (via Wikipedia)

As you can see, the term is generally reserved for political contexts: it represents the power a nation-state has to define its own laws and decide its own policy.

However, when it comes to data sovereignty, I might define the term a bit differently:

Data sovereignty is supreme, independent control and authority of an entity over some set of data that represents that entity.

So, for example, individual data sovereignty is still about control and decision-making authority, but it is assigned to an individual and to specific data related to that individual.

Because our society (at least in the United States) sees health data as deeply personal and potentially damaging in certain situations, we have assigned a fair amount of control over healthcare data to the individual—even if it was generated by a second or third party.

(The same, by the way, is true of other key data sets, such as financial data and employment records.)

Enterprises, too, have strong interests in maintaining control over certain data. That may be because the data is critical to a competitive advantage, because it represents something that could harm the company if disclosed, or because it is legally mandated to be kept confidential under certain conditions. Examples include financial information, human resources data, and even data managed on behalf of individuals who hold sovereignty over it.

However, when data is shared outside of controlled environments, it can “leak” into contexts that the enterprise did not approve. Access control has traditionally been one part of the solution to this, but traditional approaches fall short in the era of AI agents.

For example, some agents gather aggregate data on behalf of users who don’t have the right to see the individual records. But some users might have the right to ask for specific detailed data, though only while they are working on a related task. How do you grant the agent roles that let it query sensitive sources, and how do you quickly detect role changes that should restrict a user’s access to detail?
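To make this concrete, here’s a minimal sketch in Python of what task-scoped access might look like. The names here (User, Query, agent_may_run, and so on) are invented for this post and aren’t any particular product’s API. The point is that the agent’s right to compute aggregates and the user’s right to see row-level detail are evaluated separately, with detail access tied to the user’s currently active tasks.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    id: str
    roles: set           # e.g. {"analyst", "care_provider"}
    active_tasks: set    # tasks the user is working on right now

@dataclass
class Query:
    source: str          # e.g. "patient_records"
    granularity: str     # "aggregate" or "detail"
    task: Optional[str] = None  # the task this query supports, if any

# Hypothetical policy: agents may aggregate over a sensitive source on a
# user's behalf, but row-level detail requires the user to hold the right
# role AND to be actively working a related task at query time.
DETAIL_ROLES = {"patient_records": "care_provider"}

def agent_may_run(query: Query, on_behalf_of: User) -> bool:
    if query.granularity == "aggregate":
        # The agent can aggregate even if the user can't see the rows.
        return True
    required_role = DETAIL_ROLES.get(query.source)
    return (
        required_role in on_behalf_of.roles
        and query.task is not None
        and query.task in on_behalf_of.active_tasks
    )

analyst = User("u1", roles={"analyst"}, active_tasks=set())
print(agent_may_run(Query("patient_records", "aggregate"), analyst))          # True
print(agent_may_run(Query("patient_records", "detail", "case-42"), analyst))  # False
```

Because the check consults the user’s live roles and task assignments on every call, revoking a role or closing a task restricts detail access on the very next query rather than after a policy redeploy.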

Access Control and the Mosaic Effect

Data has traditionally been managed in a number of different ways, but two of the most common and most important are architecture and access control.

Architecture can be used to place logical boundaries around different elements of an application system, such as placing only the components that perform certain tasks behind firewalls that restrict access to sensitive networks.

This works fine in a world where the structure of the systems changes slowly and predictably, but it fails if rapid adaptation makes architecture management unwieldy. Agents performing arbitrary tasks can find ways (perhaps accidentally) to extract data across intended logical boundaries.

As for access control, we have an entire white paper that explains the challenges of traditional methods in the age of agentic AI, so I’ll just quote from that to demonstrate the issue:

In order to be useful, an LLM-powered copilot or agent is often connected to broad repositories, like data lakes, file shares, wikis, ticketing systems, and collaboration tools. In that environment, object-level authorization alone [like role-based or attribute-based access control] can become context-blind.

[They check] whether a user can access a given item, but [they don’t] rationalize how many permitted items can be combined to derive restricted conclusions.

This is the inference problem sometimes described as the mosaic effect. Sensitive outcomes can be inferred by synthesizing many individually permitted fragments.

The complexity of managing what data can be used in what contexts is daunting when systems are self-organizing and “intelligent” enough to work out alternative ways to dig up an answer.
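A toy sketch makes the mosaic effect concrete. In the hypothetical Python below, each function passes an object-level check on its own, because no individual row is ever returned; yet combining the permitted aggregates reconstructs exactly the value the row-level policy was meant to protect.

```python
# Hypothetical salary table; row-level reads of it are denied to the caller.
SALARIES = {"ana": 95_000, "bo": 105_000, "cy": 120_000}

def dept_salary_total() -> int:
    # Permitted: returns only an aggregate, no individual rows.
    return sum(SALARIES.values())

def salary_total_excluding(name: str) -> int:
    # Also "just an aggregate" as far as object-level checks can tell.
    return sum(v for k, v in SALARIES.items() if k != name)

# Every call above is individually allowed, but the combination leaks
# a restricted individual record:
cy_salary = dept_salary_total() - salary_total_excluding("cy")
print(cy_salary)  # 120000 -- cy's salary, reconstructed from aggregates
```

No single check failed here; the leak lives entirely in the combination of permitted answers, which is why object-level authorization alone cannot see it.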

Relationship-Based Access Control (ReBAC)

Kamiwaza applies a concept that comes into its own in the age of agentic AI: Relationship-Based Access Control, or ReBAC. Again, from the white paper:

Kamiwaza applies relationship-based access control (ReBAC). With ReBAC, authorization is based on how users and resources are connected through real enterprise relationships, like team, project, workspace, folder hierarchy, deal room, and data domain. This relationship model is represented in a context graph (often ontology-backed) that captures business structure and the relationships that matter for governance.

Unlike role-only decisions (“who you are”) or policies that rely heavily on attributes in isolation (“what labels you carry”), ReBAC answers, “What is your relationship to this data in this context?” That’s the unit of control enterprises need when agents traverse large graphs of related information.

I think of this a little like the way routes are organized for air travel. Any commercial airplane capable of completing a given route can be assigned to that route, but airplanes not assigned to it are forbidden to follow that exact path at that exact time. However, they may share part of the route in order to reach their own destination. Or they may leave at a different time on the same route. Or any combination thereof.

Connecting a user or an agent—or both—to the data they need to satisfy a prompt may require understanding how and when that data is going to be used, what other data is going to be collected, and whether other factors are at play: factors such as the prompter being part of a given team, acting on behalf of a given customer, or being located in a given country.

ReBAC enables access rules to be defined based on these relationships. It uses the graph of data relationships that Kamiwaza’s Distributed Data Engine creates when it ingests data and builds (and maintains) an ontology for your business. Using a zero-trust approach, ReBAC can combine multiple relationships into a single rule that specifically enables access to data.
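To illustrate the shape of such a rule, here’s a minimal sketch over a toy context graph. The names and the triple-based graph encoding are invented for this post and don’t reflect Kamiwaza’s actual implementation: access is granted only when the user reaches the document’s customer through a team assignment and also satisfies the document’s regional restriction.

```python
# A tiny relationship graph: (subject, relation, object) triples.
GRAPH = {
    ("alice", "member_of", "deal-team-7"),
    ("deal-team-7", "assigned_to", "acme-corp"),
    ("term-sheet.pdf", "belongs_to", "acme-corp"),
    ("alice", "located_in", "US"),
    ("term-sheet.pdf", "restricted_to_region", "US"),
}

def objects(subj: str, rel: str) -> set:
    # All objects related to `subj` by relation `rel`.
    return {o for (s, r, o) in GRAPH if s == subj and r == rel}

# A ReBAC-style rule that combines multiple relationships in one decision:
# the user must reach the document's customer via a team assignment, AND
# satisfy the document's regional restriction. No single role grants this.
def can_read(user: str, doc: str) -> bool:
    customers = objects(doc, "belongs_to")
    via_team = any(
        customers & objects(team, "assigned_to")
        for team in objects(user, "member_of")
    )
    regions = objects(doc, "restricted_to_region")
    in_region = regions <= objects(user, "located_in")
    return via_team and in_region

print(can_read("alice", "term-sheet.pdf"))  # True: team path + region match
print(can_read("bob", "term-sheet.pdf"))    # False: no qualifying path
```

Because the decision walks the graph at request time, removing alice from deal-team-7 revokes her access on the next check, which is exactly the zero-trust behavior described above.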

For more on how ReBAC works and how it enables safety and control in the chaos of large scale agentic systems, I encourage you to read the white paper.

The concept of data sovereignty will be in the back of every enterprise’s mind as we continue to build this brave new world of intelligence and interconnectivity. Kamiwaza is built specifically to address all kinds of sovereignty, including political, individual, and organizational. We’d love to show you how.

As always, I write to learn. Ask questions or let me know what you think in the comments below.