Threadbaire

Who Holds the Keys to the Agent Web - Part 1: Before the Doors Close

Lida Liberopoulou 18 March 2026 CC BY-SA 4.0

The AI agent layer is forming right now. Open protocols, file-based discovery, permissionless coordination: all the architectural ingredients for a decentralised agent web already exist. But the capture mechanisms are forming just as fast, and this time they are not coming for the user's attention. They are coming for the credentials. Authentication (who the agent is, what it is allowed to do, on whose behalf) is where the open window closes if we are not paying attention. And unlike last time, the argument for closing the window is security, not convenience.


Before Facebook, before X, and before gigantic centralized social media platforms dominated today's discourse, there were independent blogs. For a short period from the late 1990s to the early 2000s, there was an explosion of platforms that let people without technical expertise publish and maintain websites where they posted articles in reverse chronological order on any subject you could imagine. It was an exciting time, when online discussion began to expand beyond the enclosures of old mailing lists and newsgroups into the wider world. These sites started linking to each other, arguing with each other, and building something that felt like both a community and an infrastructure. We had the strange and genuine experience of discovering other people's thinking through an open protocol that nobody owned.

Anyone could publish, anyone could subscribe, and discovery happened through a format so simple that any software could read it. No company decided what surfaced and no algorithm mediated between the writer and the reader. The relationship was direct, and it worked because the connective tissue, RSS, was an open format that belonged to no one.

Why the same capture will not work twice

RSS was a structured XML file that described what a site had published, updated automatically, readable by any software that understood the format. A feed reader could subscribe to a thousand blogs and present their updates in a single stream. No API key, authentication or platform permission was needed. The feed existed at a URL, and if you knew the URL, you could read it.

RSS was a discovery and communication protocol that was radically simple, universally readable, and owned by nobody. It did not require the publisher and the subscriber to use the same software, be on the same platform, or agree to terms. The protocol was the only shared layer, and it was open.
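To make that simplicity concrete, here is a toy feed reader: a minimal sketch using only Python's standard library, parsing an invented RSS 2.0 document. The feed content and URLs are made up for illustration; a real reader would fetch the XML from a feed URL instead of a string.

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 document, invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>An Independent Blog</title>
    <link>https://example.org/</link>
    <item>
      <title>First post</title>
      <link>https://example.org/first-post</link>
      <pubDate>Mon, 02 Mar 2026 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second-post</link>
      <pubDate>Tue, 03 Mar 2026 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def parse_feed(xml_text):
    """Return (channel_title, [(item_title, item_link), ...])."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return channel.findtext("title"), items

title, items = parse_feed(SAMPLE_FEED)
```

That is the entire shared contract: one XML shape, readable by any software, with no key exchange and no permission step.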

This produced a specific kind of network. Millions of independent nodes, each publishing a feed, each discoverable by anyone running a feed reader. The aggregation happened at the edges and every reader assembled their own view. There was no canonical feed, no trending page, no recommendation engine. The topology was genuinely distributed.

And it worked. For the better part of a decade, the blogosphere was the most vibrant, intellectually alive space on the internet. It was simple enough that nothing could break it and nobody could own it.

Then Google Reader centralised RSS consumption into one free, excellent product. Then social platforms offered algorithmic discovery, not "here are the feeds I subscribe to" but "here is what is trending right now." Then Google killed Reader in 2013, citing declining usage that had declined partly because Google had stopped investing in it. The open protocol still existed; it was the ecosystem built around the centralised aggregator that collapsed. The whole arc from open protocol to platform capture took about ten years. The technical infrastructure survived but the social infrastructure did not. And the people who built the capture mechanisms made the same argument every time: we are just making it easier for users.

The RSS capture was a story about human behaviour. People chose convenience over freedom. First Google Reader was easier than managing your own feed reader. Then Facebook was more engaging than a chronological RSS stream. Humans formed habits, developed preferences, and resisted switching, and the platforms exploited all three.

Agents have none of these properties. An agent does not prefer a nicer interface. It does not form habits or resist change. An agent will read a well-known JSON file at a standard URL exactly as willingly as it will call a proprietary API, in fact even more willingly because the open format is simpler to parse. You cannot capture an agent through convenience, because an agent has no comfort zone to exploit. You cannot capture it through engagement, because it does not get bored. And you cannot create switching costs through familiarity, because it will use whatever it is pointed at without complaint.

This means the capture cannot work at the level of the agent's own behaviour. It has to work somewhere else: at the level of the developer who configures the agent, the infrastructure the agent depends on, or the services the agent needs to access. The capture targets the human decisions and institutional defaults that determine what the agent can reach, not the agent itself. Understanding this is the difference between building the right defences and building the wrong ones. The RSS-era defence was "choose the open option." That assumed the user was the one being captured. In the agent era, the user is captured indirectly through the infrastructure choices that are made once and inherited by every agent that runs inside them.

The agent layer right now

The current AI agent ecosystem is at a structural moment that rhymes with the RSS era but does not repeat it. Protocols are forming, conventions are emerging, and the capture mechanisms are becoming visible. But the nature of what agents do means the capture works at a different layer.

The discovery layer is RSS-simple. Multiple projects have independently converged on the same architecture: plain text files describing agent capabilities. Claude Code uses CLAUDE.md. Google's Workspace CLI ships with AGENTS.md, CONTEXT.md, and over a hundred SKILL.md files. The AGENTS.md convention is spreading across repositories. The llms.txt proposal puts a markdown index at the site root for AI crawlers. Cloudflare now serves markdown to agents via HTTP content negotiation.

These conventions are the closest thing to RSS simplicity in the agent ecosystem. A file at a known path, human-readable in any text editor, with no authentication, no SDK, and no registry. Create the file, run the agent, it works.
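A sketch of what an agent's discovery probe could look like, under the conventions named above. The candidate list and the function are invented for illustration; no standard mandates which paths an agent should try.

```python
from urllib.parse import urljoin

# Discovery conventions mentioned in the text. The exact candidate
# set an agent probes is a design choice, not a standard.
DISCOVERY_PATHS = ["llms.txt", "AGENTS.md", "CLAUDE.md", "CONTEXT.md"]

def discovery_urls(site_root):
    """Build the candidate discovery URLs for a site.

    No authentication, no SDK, no registry: a plain GET against
    each URL either returns a readable text file or a 404.
    """
    base = site_root if site_root.endswith("/") else site_root + "/"
    return [urljoin(base, path) for path in DISCOVERY_PATHS]
```

The whole mechanism fits in a dozen lines, which is exactly why there is nothing in this layer for a platform to own.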

But a practitioner caveat is warranted: major LLM companies are not yet reliably fetching these files. Millions of sites publish llms.txt; the powerful hosts mostly ignore it. Some dismiss these conventions entirely: one prominent voice declared llms.txt "useless" on the grounds that agents are smart enough to read whatever humans already built. That argument confuses execution with discovery. An agent can read a website without llms.txt the same way a human could discover blogs by typing random URLs: technically possible, practically useless at scale. Discovery conventions are not execution abstractions but signposts. And the fact that major platforms currently ignore these signposts is itself evidence of the capture dynamic, not evidence that the signposts are unnecessary.

The discovery layer is structurally sound but socially unproven. Then again, RSS was structurally sound for years before enough publishers and readers made it socially real.

The execution layer does not need new protocols. Agents are trained on shell commands and API documentation. They can call existing APIs, run CLI tools, and read web pages. The argument that AI does not need new abstractions because it can use what humans already built holds for execution. A coding agent running shell commands is using bash, not MCP or A2A.

The protocol debate, MCP versus CLI or A2A versus direct API calls, is largely about execution convenience and standardisation. It matters for developer experience but does not determine whether the agent layer stays open or gets captured.

The problem RSS never had

RSS distributed public content and a feed reader fetched a file at a URL. No credentials needed because nothing was private. The entire protocol operated on the assumption that what was being shared was meant to be found.

Agents do not just read public feeds. They check your email. They update your spreadsheets. They file tickets in your project management tool. They commit code to your repositories. Every one of those actions requires the agent to prove who it is, demonstrate that someone authorised it to act, and do so within specific boundaries of time and scope.

This is the authentication problem, and it has no RSS equivalent.

OAuth, the standard that governs how applications access your accounts, was designed for a specific interaction: a human sits in front of a screen, clicks "Allow," and grants an application permission to act on their behalf. That flow works when the human is present but it breaks when an agent needs to act autonomously, across multiple services, without a human clicking consent screens for each one.

The numbers are blunt. Glasskube, a team that built an MCP authentication gateway, tested mainstream identity providers against MCP's requirements and summarised the result as a "dumpster fire." Only 4% of authorisation servers they tested supported the dynamic client registration that MCP's specification calls for. The Nextcloud MCP server maintains a compatibility matrix tracking authentication failures across Claude Code, Gemini CLI, and Copilot, with each breaking in different ways against different identity providers. A Google Gemini CLI issue reports that the OAuth discovery flow silently drops a required URL, preventing registration entirely without manual configuration.

The MCP specification has already pivoted once, from dynamic client registration to a newer mechanism called Client ID Metadata Documents, because the original approach was operationally fragile, created database growth and abuse risks, and enterprise identity providers largely refused to support it. The newer mechanism trades one set of problems for another: it eliminates some registration failures but introduces new attack surfaces that the specification now explicitly warns about.
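For context, this is roughly the shape of a dynamic client registration request under RFC 7591, the mechanism the original MCP approach relied on. The field names come from that RFC; the client name, redirect URI, and endpoint are hypothetical. The sketch also shows where the operational fragility comes from: every agent instance that registers this way mints a new client record on the authorisation server.

```python
import json

# Roughly the shape of an RFC 7591 dynamic client registration
# request body. Field names are from the RFC; the client name and
# loopback redirect URI are hypothetical. Each agent instance that
# POSTs this to a server's registration endpoint gets its own
# freshly minted client_id -- hence the database-growth and abuse
# risks mentioned above.
registration_request = {
    "client_name": "example-local-agent",                 # hypothetical
    "redirect_uris": ["http://127.0.0.1:8976/callback"],  # loopback redirect
    "grant_types": ["authorization_code", "refresh_token"],
    "response_types": ["code"],
    "token_endpoint_auth_method": "none",                 # public client
}

body = json.dumps(registration_request)
# This body would be POSTed to the server's registration endpoint;
# a compliant server responds with a new client_id.
```

Multiply that registration by every agent session on every machine and the enterprise identity providers' refusal to support it becomes easier to understand.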

Why auth is the capture point

With RSS, capture happened through convenience. Google Reader was easier than managing your own feed reader. Humans chose comfort over sovereignty.

With agents, capture happens through security. You cannot opt out of authentication the way you could opt out of Google Reader. If your agent needs to access Gmail, it needs credentials. If it needs credentials, something has to manage them. The question is what that something is and who controls it.

Right now, the identity industry is answering that question fast. Auth0 has shipped "Auth0 for AI Agents" with a token vault and asynchronous authorisation. Descope has launched an "Agentic Identity Hub" for managing agent credentials and policies. Okta is pushing Cross App Access, a framework where the enterprise identity provider mediates all agent-to-application connections.

Each of these is a legitimate product solving a real problem. Each also positions its provider as the delegation broker, as the entity that sits between your agent and every service it touches, holding the credentials, managing the tokens, logging the actions.

Whoever becomes the default delegation broker for agents owns the graph of which agents exist, which systems they touch, which scopes they hold, and which workflows recur. That is an authority graph: a map of what every agent, and by extension every user, is permitted to do across every service.

This is the new Google Reader. Not a feed aggregator but an auth aggregator. And unlike Google Reader, you cannot switch away from it without re-authorising every connection your agents depend on.

And this has consequences for both individual users and organisations.

For an individual: your agent accesses your email, your calendar, your cloud storage, your project management tool, your code repositories, all through one identity vendor's credential vault. That vendor may not see the full content of every action, but it sees the metadata: which services your agents connect to, which scopes they hold, how often tokens are issued and refreshed, which workflows recur. That is enough to map your operational patterns without reading your mail. And if you decide to switch, you will need to re-authorise every connection from scratch, every OAuth consent screen, every scope negotiation, every service-specific configuration.

For an organisation: the identity provider becomes the control plane for all agent activity. Procurement, compliance, and audit all flow through one vendor. And here is the complication that makes this different from the RSS story: many organisations will want this, because distributed agent authority is genuinely harder to certify, audit, and insure. A centralised auth layer may win in the enterprise the same way compliance vendors win, by being the easiest answer to "what is our approved control surface for agent permissions?" That is not capture in the adversarial sense, but the structural effect is the same.

The MCP protocol itself is drifting in this direction. The latest specification update incorporates "Enterprise-Managed Authorisation," explicitly described as putting the enterprise identity provider back in the driver's seat. Registries are forming: there is already an official MCP registry, Azure's agent registry, and GitHub's enterprise MCP policies. Open protocols are sprouting centralised infrastructure on top, and the infrastructure is where the control accumulates.

None of this means agent protocols are secretly closed. MCP is open source, A2A is open and the governance layer (AAIF, hosted by the Linux Foundation) is a genuine multi-vendor effort. But "open" at the protocol level does not prevent centralisation at the infrastructure level. AAIF's platinum membership costs $350,000 and includes a guaranteed board seat with full voting rights and influence over strategic direction. RSS did not require a $350,000 seat to influence the format's future. Open code and concentrated governance can coexist, and often do.

The counter-position

The capture is not inevitable. Auth does not have to centralise around cloud identity vendors. But the alternative is harder than a clean architectural diagram suggests.

Think about password managers. When you use a password manager, the remote service never knows a password manager is involved. The manager fills in your credentials at the moment of action, on your device and under your control. The manager does not need to be trusted by the remote service. It is invisible and the service sees just a normal login.

The same principle can apply to agent authentication, with an important caveat. A password manager assists a human at the moment of action; an agent delegation broker executes actions over time, unattended, across multiple services. That is harder. Token lifecycle management, refresh rotation, scope negotiation, revocation across services, shared-device security, and recovery flows are not "password manager simple" once the broker is operating autonomously across many APIs. The model stays local, user-controlled, and invisible to the remote service, but it compresses real operational difficulty into one component.

With that caveat stated, here is the principle:

The agent does not need to hold credentials. The agent does not need its own cloud identity. Instead:

The agent discovers what needs to be done through the open discovery layer (file conventions, APIs, documentation). It composes the action: the specific call, the parameters, the intent. It hands the composed action to a local delegation broker on your device or in your infrastructure. The broker attaches the credentials, checks the policy, executes the call, and returns the result. The agent never sees the token.

The broker is local. It is yours and it holds your OAuth tokens the way a password manager holds your passwords. It runs on your machine or your server, not on a vendor's cloud. It is open source, self-hosted and user-controlled.

The policy is a file. A readable, versionable document that defines what the agent is allowed to do, for how long, against which services, with what constraints. It is not an opaque dashboard in someone else's product. It is just a file in your project, next to your other project files, inspectable by the same tools you use for everything else.
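No common policy format exists yet, so any concrete example is speculative. As an invented sketch, a policy file could be as simple as a mapping from services to grants, checked default-deny:

```python
# A hypothetical policy file for a local delegation broker,
# expressed as plain data. The schema is invented for illustration;
# no common format exists yet.
POLICY = {
    "gmail":    {"allowed": True, "scopes": ["gmail.send"]},
    "calendar": {"allowed": True, "scopes": ["calendar.events"],
                 "constraints": {"working_hours_only": True}},
    "repos":    {"allowed": False, "scopes": []},
}

def is_permitted(policy, service, scope):
    """Default-deny check: a service must be listed, allowed,
    and explicitly hold the requested scope."""
    entry = policy.get(service)
    if not entry or not entry.get("allowed"):
        return False
    return scope in entry.get("scopes", [])
```

Because the policy is plain data in a plain file, it diffs, versions, and audits like any other project file.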

Every privileged action produces a receipt. What was called, what scope was used, what the agent's intent was, what the result was, when, which model was involved. The receipts live in your files and they are portable across tools and time.

In practice, this is what it looks like. You tell your agent: "reply to that email from the accountant and schedule a follow-up meeting for Thursday." The agent drafts the reply and composes a calendar invite. It hands both to the local broker. The broker checks the policy file: email allowed; calendar allowed within working hours, no external attendees without confirmation. The broker uses your Gmail token for the reply and your Calendar token for the invite, and both calls execute. The broker writes a receipt: email sent to this address at this time using this scope, meeting created for Thursday 2pm, model was Claude, agent session ID logged. You can read that receipt in a text editor. You can audit it next month, and you can move it, both the policy file and the broker, to a different agent next year. Nothing about this flow requires a cloud identity vendor. Nothing about it is visible to the remote services. Gmail sees a normal OAuth token. Google Calendar sees a normal OAuth token. The delegation happened locally, the policy is yours, and the proof lives in your files.
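The walkthrough above can be compressed into a toy broker loop. Everything here is invented for illustration (the action shape, the vault, the stubbed execution); the point is the separation of concerns: the agent supplies intent, the broker supplies credentials, checks policy, and writes the receipt.

```python
from datetime import datetime, timezone

def broker_execute(action, policy, vault, receipts):
    """Check policy, attach the credential, execute, write a receipt.

    The agent hands over only the composed action; the token in the
    vault never crosses back to the agent. The execution step is
    stubbed where a real broker would call the actual service API.
    """
    entry = policy.get(action["service"], {})
    if not entry.get("allowed") or action["scope"] not in entry.get("scopes", []):
        raise PermissionError(
            f"policy denies {action['service']}:{action['scope']}")
    token = vault[action["service"]]   # credential stays broker-side
    result = {"status": "ok"}          # stub for the real API call
    receipts.append({
        "service": action["service"],
        "scope": action["scope"],
        "intent": action["intent"],
        "result": result["status"],
        "time": datetime.now(timezone.utc).isoformat(),
    })
    return result
```

The receipts list here stands in for an append-only log file; written to disk, it becomes the portable audit trail the text describes.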

This does not exist yet as an assembled product but the components are real. Bitwarden ships an MCP server explicitly designed for local use, the credentials stay on your machine and the repo warns against public exposure. Docker's MCP Toolkit manages OAuth automatically behind a local gateway with default-deny filesystem access and tool request interception. What is missing is the combination: local credentials, local policy enforcement, and portable receipts in one thing a non-technical person can install in an afternoon. That gap is the opening the cloud vendors are filling and will continue to fill until someone assembles the local alternative.

And permissionless coordination is not a fantasy. It is already working at the hardest layer. In March 2026, a team published Covenant-72B — a 72-billion-parameter language model trained across seventy permissionless peers on commodity internet, with no centralised cluster, no whitelist, and anyone free to join or leave. The model is competitive with centralised baselines and ships under Apache 2.0. According to their summary: "The constraint was never physics. It was always coordination." If permissionless coordination can produce a competitive foundation model over ordinary internet connections, then permissionless agent discovery and portable context over open file conventions is trivially achievable by comparison.

The discovery layer will stay open because it is simple enough that capture has nothing to grab. The execution layer will stay open because agents can use what already exists. Auth is where the fight is. And the fight is not about which protocol wins but about whether the delegation broker that manages your agent's permissions is something you run or something a vendor runs for you.

What has to be true

I have to be honest here. I do not know if the open version wins. The cloud identity vendors are well-funded, well-positioned, and solving a real problem. Enterprises will centralise around managed identity providers regardless; that is what enterprises do, and for legitimate governance reasons.

But the default for individuals, small teams, and independent builders does not have to be a cloud broker. The local option can exist. Password managers proved that: 1Password is a cloud service, KeePass is local, both work, and the existence of the local option means capture is not the only path.

For the local option to be viable, three things need to be true:

The broker needs to be simple enough that it does not become its own project. If setting up a local delegation broker requires more work than using Auth0's hosted version, most people will use Auth0. Simplicity is not a nice-to-have. It is the structural condition for independence.

The policy format needs to converge. If every agent framework defines its own policy syntax, the broker becomes framework-specific and portability dies. A readable, common format for capability grants (what the agent can do, for whom, against which services, for how long) is the equivalent of RSS for auth policy. It needs to be simple enough that nothing can break it and nobody can own it.

Services need to accept normal OAuth tokens from local applications. If major services start requiring agent-specific authentication flows that only work through approved identity vendors, the local broker model breaks. This is the political risk: the same way platforms eventually required API keys and rate-limited RSS consumption, services could require "agent-verified" credentials that only designated brokers can issue. The incentives to do so are real: fraud prevention, liability allocation, rate control, and the commercial opportunity of becoming a tollbooth on agent traffic. That pressure will grow as agent volume increases.

The window

I watched the open web get captured the first time. The format survived but the independence did not. The structural choice is available again, but this time the capture mechanism is security, not convenience. And security is a harder argument to fight, because the need for it is real.

The discovery layer is too simple to capture and the execution layer does not need new protocols. The authentication layer (who is this agent, what is it allowed to do, on whose behalf) is the part where the fight is happening right now, mostly invisibly, in specification drafts and identity vendor product launches and enterprise governance frameworks.

The people building the capture mechanisms are not making the same argument as last time. They are not saying "we are just making it easier for users." They are saying "we are making it more secure for enterprises." And they are right because the need is real. The question is whether security requires centralisation, or whether the same security properties can be achieved with infrastructure you control.

The answer is: architecturally, probably yes. Practically, it is still unproven. A local broker with file-based policy and mandatory receipts can provide delegation, scoping, audit, and revocation. Whether it can do so reliably enough, simply enough, and at a scale that matters against well-funded identity vendors who are solving the same problem with more resources is a different question. The gap between "architecturally plausible" and "practically viable" is where the real fight happens.

And the window has a time limit. Once deployment patterns harden, once SDK defaults consolidate, once procurement pathways lock in around specific identity vendors, the open option becomes merely theoretical. By the time centralisation appears optional again, it is usually too late. This is the history of every open protocol that waited too long to build the alternative.

Auth may not be the only capture point. The runtime where agent plans execute, the registry where tools are found and ranked, the model provider that sees prompts and outputs: all of these are candidates. But auth is the first place where open agent architecture collides with private power, and therefore the most immediate place where centralisation can harden. It is the layer where the need for security creates a genuine argument for centralisation, and where the alternative has to be not just ideologically appealing but operationally sound.

The doors are still open. But this time, the thing that could close them is not a better aggregator. It is a better credential vault. And the people building the vaults are moving fast.

Everything in this article describes what is happening now: agents acting on systems built for humans, logging into your email, filing tickets, committing code. The auth question today is: who holds the credentials when an agent acts on your behalf?

But there is a longer question forming behind this one. It is not technically possible yet, and it may not be for years. But the architectural decisions being made right now will determine whether the answer is open or captured when it arrives.

The question is: what happens when agents stop acting on human-facing systems and start negotiating directly with each other? When "book me the cheapest flight to Thessaloniki that gets me back by Sunday" is not a query to a booking website but a negotiation between your agent and the airline's agent with no search interface, no human-facing application, no system built for a person to click through. The auth architecture described in this article assumes the booking system exists. But what governs the exchange when it does not?

That is for Part 2.

