Why Enterprises Need MCP Governance Now


The Week MCP Got Complicated

A wide network of scattered MCP server nodes connected by overlapping, unstructured lines with no central control, illustrating ecosystem sprawl and lack of governance.

MCP governance is no longer a future-state concern. The infrastructure decisions organizations make in the next two quarters will determine whether their agent deployments are auditable, controllable, and secure, or whether they’re inheriting a sprawl problem that compounds faster than any team can manage manually.

Last week handed the industry two concrete data points. Perplexity’s CTO announced at Ask 2026 that the company is stepping back from MCP as an internal execution layer, citing token overhead and authentication friction. On the same day, Cloudflare published a detailed analysis of why direct tool calling breaks down at scale and what the fix looks like. Read together, these developments tell a specific story: MCP is maturing under pressure, the gaps are being named publicly by organizations with real production stakes, and the protocol’s role is narrowing toward what it does well.

What neither announcement resolves is the governance problem. More than 5,800 MCP servers are in the ecosystem. SDK downloads exceed 97 million per month. That growth does not pause while engineers debate execution paradigms.


What Perplexity Actually Said, and What It Means

At Perplexity’s first-ever developer conference, Ask 2026, held in San Francisco on March 11, CTO and co-founder Denis Yarats announced that the company is moving away from MCP for its internal systems in favor of classic APIs and CLIs. As reported in awesomeagents.ai’s coverage of the conference, the decision centers on two concrete operational friction points: high context-token consumption from round-tripping intermediate results back through the model, and authentication friction that creates real production headaches.

A sequential workflow where an AI model repeatedly routes between multiple tools in a looping chain, with glowing paths highlighting increasing latency and token usage.

That is a precise and narrow critique. Yarats is describing a problem with MCP as a runtime executor for multi-step agentic tasks, where every tool result passes back through the model just to be forwarded to the next call. Each round-trip burns tokens and adds latency. Combined with what Aakash Gupta noted on X regarding MCP’s security posture, including near-zero authentication across nearly 2,000 scanned servers and a spec that has not been updated since November 2025, the operational case for pulling back from MCP as an internal execution layer is straightforward engineering judgment, not a philosophical stance against connectivity standards.

The Irony Worth Noting

On the same day Yarats made this announcement, Perplexity’s own developer documentation was shipping an official MCP server with one-click install for Cursor, VS Code, and Claude Desktop. The company is targeting $656 million ARR by end of 2026, with APIs already embedded in hundreds of millions of Samsung devices and integrations across six of the Mag 7. Revenue at that scale does not flow through protocols with unresolved MCP OAuth authentication models.

The apparent contradiction resolves cleanly when you separate the layers. MCP as a discovery and connectivity standard still has real value; Perplexity’s developer-facing MCP server is evidence of that. What Perplexity is stepping back from is using MCP’s direct tool-calling mechanism as the execution backbone for internal agentic workflows.

This is precisely why MCP governance infrastructure matters more now than it did six months ago. The protocol is maturing under pressure, and the gaps being named in public by a company of Perplexity’s scale are the same gaps that will drive standardization forward.

The Technical Case: Why Direct Tool Calling Struggles at Scale

Perplexity’s operational critique lands harder when you understand the underlying mechanics. The friction points Yarats named are symptoms of a structural mismatch that Cloudflare’s engineering team has documented in their Code Mode analysis.

Why LLMs Are Poor Tool Callers by Default

The first problem is a training data problem. Large language models have been trained on an enormous corpus of code: billions of lines of TypeScript, Python, shell scripts, and API documentation. Tool calling schemas, by contrast, are nearly absent from that training data. Cloudflare’s framing is direct: asking an LLM to use tool calling is like putting Shakespeare through a one-month Mandarin course and then asking him to write a play in it. The model can follow the syntax, but it is working against the grain of everything it learned.

The second problem is sequential inefficiency. In a standard multi-step agentic task, every intermediate result routes back through the model before the next call can be made. Call tool A, result returns to the LLM, the model reads it, calls tool B, result returns, the model reads it, calls tool C. Each hop burns tokens and adds latency. For a three-step task this is annoying. For a fifteen-step workflow, it becomes a real throughput bottleneck.

The third problem is context explosion at scale. Cloudflare’s own API has over 2,500 endpoints. Representing those as individual MCP tools would consume more than 2 million tokens in a single context window.
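The round-trip cost in the second problem compounds quickly. A back-of-envelope sketch (illustrative numbers, not measurements) makes the difference concrete: assume each intermediate result is roughly 500 tokens, and that in direct tool calling the whole accumulated context is re-processed on every hop, while in code-as-orchestration only the final value re-enters the model.

```typescript
// Toy model of context growth under direct tool calling: each intermediate
// result lands in the context, and the entire context is re-read before the
// next call, so early results are re-processed on every subsequent hop.
function directCallingTokens(steps: number, resultTokens: number): number {
  let total = 0;
  let context = 0;
  for (let i = 0; i < steps; i++) {
    context += resultTokens; // result i is appended to the context...
    total += context;        // ...and the whole context is read on the next hop
  }
  return total;
}

// Under code-as-orchestration, intermediate values stay inside the sandboxed
// runtime; only the final result returns to the model's context.
function codeModeTokens(_steps: number, resultTokens: number): number {
  return resultTokens;
}

directCallingTokens(3, 500);   // 3000 tokens for a three-step task
directCallingTokens(15, 500);  // 60000 tokens for a fifteen-step workflow
codeModeTokens(15, 500);       // 500 tokens regardless of step count
```

The quadratic-versus-constant shape is the point: the gap between paradigms widens with every additional step, which is why a fifteen-step workflow turns an annoyance into a throughput bottleneck.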

The Code-as-Orchestration Pattern

Cloudflare’s proposed fix reframes what the LLM is doing. Rather than calling tools sequentially, the model writes typed TypeScript code against converted MCP server schemas, and a sandboxed runtime executes the whole block in one pass. Intermediate values stay inside the code; they never re-enter the model. According to the r/mcp discussion covering Cloudflare’s analysis, this approach reduces token consumption by 32 to 81 percent depending on task complexity.
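In code, the pattern looks roughly like this. The `CRM` and `Mailer` bindings below are hypothetical stand-ins for the typed interfaces that would be generated from MCP server schemas; the point is the shape of the interaction, not the specific API:

```typescript
// Hypothetical typed bindings, as would be generated from MCP tool schemas.
interface Contact { id: string; email: string; plan: string }
interface CRM {
  listContacts(filter: { plan: string }): Promise<Contact[]>;
}
interface Mailer {
  send(to: string, subject: string, body: string): Promise<void>;
}

// The model emits a block like this once; the sandboxed runtime executes it
// in a single pass. The intermediate contact list never re-enters the
// model's context, only the final summary value does.
async function run(crm: CRM, mailer: Mailer): Promise<number> {
  const contacts = await crm.listContacts({ plan: "enterprise" });
  for (const c of contacts) {
    await mailer.send(c.email, "Renewal notice", `Your plan renews soon.`);
  }
  return contacts.length; // the only value that flows back to the model
}
```

Compare this with the direct-calling equivalent, where the full contact list would round-trip through the model between the lookup and every send.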

A side-by-side comparison showing inefficient step-by-step tool execution on one side and a clean, contained sandboxed execution flow on the other.

This pattern is gaining real engineering traction beyond Cloudflare. Pydantic’s Monty implements the same sandboxed bytecode VM concept for Python stacks. Zapcode, an open-source project surfaced in the same community discussion, applies the identical architecture to TypeScript using a Rust-based interpreter with 2 microsecond cold starts and suspend/resume support for long-running tool calls. Three independent implementations of the same thesis are not a coincidence; they are convergent engineering judgment.

MCP governance questions sharpen considerably in this context. The protocol’s role is narrowing toward what it does best: discovery and schema publication. The execution layer is being handed to sandboxed runtimes that never expose raw credentials to the model and enforce deny-by-default network access. That separation of concerns is exactly the kind of architectural clarity that security and compliance teams can reason about.
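Deny-by-default network access is easy for security teams to reason about precisely because it is simple to state. A toy sketch of the guard a sandbox might wrap around outbound calls (the allowlist and the `guardedFetch` name are hypothetical, not a real runtime API):

```typescript
// Hosts are bound explicitly at deploy time; everything else is denied.
const allowedHosts = new Set(["api.example.com"]);

// A sandbox would expose something like this instead of raw fetch. The
// runtime attaches credentials on the allowed path, so model-written code
// never sees a raw secret.
function guardedFetch(url: string): string {
  const host = new URL(url).hostname;
  if (!allowedHosts.has(host)) {
    throw new Error(`network access denied: ${host}`);
  }
  return `fetched ${url}`; // placeholder for the credentialed request
}
```

The inversion matters: instead of auditing everything the model might do, you audit a short allowlist of what the runtime permits.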

The Part Nobody’s Talking About: MCP Sprawl Doesn’t Care Which Paradigm Wins

Whether your organization lands on direct tool calling, sandboxed code execution, classic APIs, or some combination that shifts by team and use case, MCP servers are multiplying. The ecosystem now numbers more than 5,800 servers, with SDK downloads exceeding 97 million per month. That growth accelerates regardless of which execution pattern wins inside your infrastructure.

MCP Governance Is the Constant Across Every Paradigm

Multiple independent execution paths converging into a single centralized governance layer that controls and routes connections across the system.

None of it goes away when you switch execution models: not the auditability questions, not the access control requirements, not the OAuth authentication gaps flagged in the r/mcp discussion. A more heterogeneous environment only makes them harder. When some teams run Code Mode against MCP schemas, others use direct tool calling, and a third group has gone back to CLIs, each path still touches MCP servers. Each path still requires knowing which servers are authorized, which credentials are in use, and who approved the connection.
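Those three questions translate directly into data a governance layer can hold. A minimal sketch of what a governed catalog entry might look like (the field names and `isAuthorized` helper are illustrative, not the schema of any particular product):

```typescript
type ExecutionPattern = "direct" | "code-mode" | "cli";

// One entry per approved MCP server: who approved it, which credential it
// uses (a reference into a secrets store, never the secret itself), and
// which execution patterns are sanctioned for it.
interface CatalogEntry {
  serverUrl: string;
  approvedBy: string;
  credentialRef: string;
  executionPatterns: ExecutionPattern[];
}

// The check is the same whether the caller is a direct tool call, a
// sandboxed code block, or a CLI wrapper.
function isAuthorized(
  catalog: CatalogEntry[],
  serverUrl: string,
  pattern: ExecutionPattern
): boolean {
  const entry = catalog.find((e) => e.serverUrl === serverUrl);
  return entry !== undefined && entry.executionPatterns.includes(pattern);
}
```

Because the check keys on the server and the pattern rather than on any one execution paradigm, the catalog stays valid as teams migrate between approaches.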

The early REST API era offers a useful parallel. When the industry was fighting over REST versus SOAP, the debate was genuine and consequential. REST won. But that outcome did not make API gateways optional; it made them necessary infrastructure, because the volume and diversity of API traffic scaled faster than any individual team could manage by hand. The gateway layer was not a bet on a specific protocol. It was a bet on the inevitability of scale and complexity.

MCP is at the same inflection point, and the Obot MCP Gateway is built on the same logic. It provides centralized visibility and control across MCP deployments without requiring organizational consensus on which execution pattern to standardize. Teams can evolve their approach. The governance layer stays constant, because the need for centralized audit trails and controlled server access is a function of how fast MCP is spreading, not how tools get called.

What Enterprise Leaders Should Actually Do Right Now

Don’t Let One Company’s Internal Pivot Drive Your Roadmap

An organized network of MCP servers arranged in a structured grid, connected through a central control layer that enables visibility and coordinated management.

Perplexity’s decision to step back from MCP as an internal execution layer is legitimate engineering judgment, not a verdict on MCP as infrastructure. The protocol’s standardization value, particularly for tool discovery, schema publication, and connectivity across heterogeneous environments, remains intact regardless of which execution pattern wins at the runtime layer. Organizations that pause or cancel MCP investments based on a single public announcement are conflating two separate architectural questions.

Audit What’s Already Running

Shadow AI deployments are compounding faster than most governance functions can track. Before adding a single new MCP server, your team should know exactly how many are already running, where they live, what credentials they’re using, and who approved them. The r/mcp community discussion covering the Perplexity and Cloudflare developments flagged near-zero authentication across nearly 2,000 scanned MCP servers. When organizations do this audit for the first time, they find more than they expected.
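A first-pass inventory can start from client configuration files. The sketch below assumes configs in the `mcpServers` shape used by MCP clients such as Claude Desktop, and counts how many teams reference each server; a real audit would also need to cover gateways, CI secrets, and running processes:

```typescript
// The common client-config shape: a map of server name to launch spec.
interface ClientConfig {
  mcpServers: Record<string, { command: string; args?: string[] }>;
}

// Aggregate configs collected from across the org into a count of how many
// teams reference each MCP server name.
function inventory(configs: ClientConfig[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const cfg of configs) {
    for (const name of Object.keys(cfg.mcpServers)) {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    }
  }
  return counts;
}
```

Even this crude count answers the first governance question, how many servers are already running and where, and usually surfaces servers nobody remembers approving.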


Evaluate the Connectivity Layer Separately from the Execution Layer

MCP can remain your standard for how tools get discovered and authorized even if sandboxed code execution becomes how those tools get called. These are separable architectural decisions. Locking them together forces a false choice. The pragmatic path is treating MCP as the control plane for tool access while remaining flexible about execution patterns as they mature.

Get Centralized Logging in Place Before You Scale, Not After

Centralized audit trails, controlled server catalogs, and resolved MCP OAuth authentication flows are not features you can retrofit cheaply once your agent infrastructure has sprawled across a dozen teams. The organizations that will be ahead of this problem in twelve months are the ones building that visibility layer now, while the footprint is still manageable.

The confident position is neither abandonment nor uncritical adoption. It is the same posture good engineering teams take with any fast-moving infrastructure: instrument it, audit it, and govern it before you scale it.

The Right Abstraction Layer Is the One You Can Govern

A stable central governance layer anchoring a system of changing outer network layers, illustrating governance as a constant amid evolving execution patterns.

The execution debate will continue. Sandboxed runtimes will mature, authentication standards will consolidate, and some version of the paradigm question will resolve over the next twelve to eighteen months. What won’t resolve on its own is the sprawl already in motion: 5,800-plus servers, 97 million monthly SDK downloads, the shadow deployments your audit hasn’t found yet.

The organizations that come out ahead won’t be the ones who called the paradigm debate correctly. They’ll be the ones who built the visibility layer before they needed it, with centralized audit trails, controlled server catalogs, and resolved OAuth flows that hold regardless of which execution pattern each team ultimately lands on.

The Obot MCP Gateway is built for exactly this moment: MCP governance as constant infrastructure, stable across whatever architectural shifts come next.
