AI Middleware Is Becoming Critical Infrastructure, and MCP Is the Audit Boundary

Anatoliy Kolodkin

16 May 2026 • 5 min read

The phrase “AI middleware” sounds like something vendors invented to make plumbing billable. Unfortunately, it is also the right threat model. The dangerous layer in modern agent stacks is increasingly not the model itself, but the glue around it: model routers, MCP servers, plugin managers, tool proxies, local CLIs, background supervisors, and CI/CD integrations that quietly hold the keys to everything developers touch.

SiliconANGLE’s latest supply-chain analysis makes that point through a cluster of recent AI-security incidents: the TeamPCP supply-chain attack, Anthropic’s accidental Claude Code source exposure via an npm packaging mistake, Claude Mythos-style defensive AI work, compromised developer/security tooling, malicious LiteLLM versions, and undocumented MCP plugins. The useful conclusion is not “AI risk is scary.” That sentence has stopped doing work. The conclusion is sharper: agent middleware sits directly in the data path and must be treated like critical infrastructure.

The model router is now a privileged production component

The TeamPCP example is the cleanest warning. The attackers reportedly compromised trusted security and CI/CD tooling, including Trivy and Checkmarx, and targeted LiteLLM, an open-source Python library and proxy that exposes a unified interface to more than 100 large language models. SiliconANGLE names malicious LiteLLM versions 1.82.7 and 1.82.8 as containing an obfuscated, multistage credential stealer and dropper.

That matters because a model proxy is not a harmless adapter. It sees prompts, responses, headers, API keys, environment variables, routing decisions, and often the credentials required to call downstream services. In a mature agent setup, the proxy may also sit next to CI tokens, Kubernetes access, repository credentials, observability keys, and internal service endpoints. Compromise that layer and the attacker does not need to defeat the model. They can move laterally through the same channels the automation uses every day.

SiliconANGLE describes the blast radius as “thousands of potential compromises in just a few hours,” which is exactly what happens when highly connected developer infrastructure goes bad. This is the SolarWinds lesson translated into an AI-native stack. The new part is not supply-chain compromise. The new part is that agent infrastructure is being deployed faster than many teams are inventorying it.

MCP is not a side quest

The most important line in the piece is its warning that undocumented MCP plugins increase the damage, because they can themselves be compromised and turned to malicious purposes. That should land uncomfortably for anyone adding MCP servers to a coding-agent workflow. MCP is useful because it standardizes tool access: issue trackers, databases, cloud consoles, monitoring systems, design files, Slack, Gmail, internal APIs. It is also risky for exactly the same reason.

Claude Code users should read this alongside the product’s own trajectory. Plugins can package skills, agents, hooks, MCP servers, settings, binaries, and language-server integrations. Background sessions can keep running without a terminal. Agent view turns multiple sessions into a persistent operational surface. None of that is inherently reckless. It is the point of the tool. But every additional connector creates a delegated-authority boundary that needs identity, scope, logging, and revocation.

The mistake is treating MCP servers as developer conveniences rather than API gateways. If an MCP server can query a database, post to Slack, open a pull request, read production logs, or trigger a deployment, it deserves the same controls you would apply to any other privileged integration. Authentication cannot be optional. Authorization cannot be inferred from a UI hiding a tool. Credentials cannot be shared across every user because it was easy. Audit logs cannot just say “the agent did it.”
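As a rough illustration of "authorization cannot be inferred from a UI hiding a tool," here is a minimal deny-by-default check. It is a sketch under assumptions, not part of MCP or Claude Code: the `ToolGrant` and `ToolAuthorizer` names, the principal strings, and the server and tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolGrant:
    """One explicit grant: a principal may call one tool on one MCP server."""
    principal: str  # authenticated user or service identity (hypothetical format)
    server: str     # MCP server name (hypothetical)
    tool: str       # tool name exposed by that server (hypothetical)


class ToolAuthorizer:
    """Deny-by-default authorizer: only explicitly granted calls pass."""

    def __init__(self, grants):
        self._grants = frozenset(grants)

    def is_allowed(self, principal, server, tool):
        # A missing or empty principal means the caller was never authenticated.
        if not principal:
            return False
        return ToolGrant(principal, server, tool) in self._grants


authorizer = ToolAuthorizer([
    ToolGrant("alice@example.com", "issue-tracker", "create_ticket"),
    ToolGrant("ci-bot", "deploy", "trigger_deployment"),
])

# Alice can file tickets, but nothing lets her trigger a deployment.
print(authorizer.is_allowed("alice@example.com", "issue-tracker", "create_ticket"))
print(authorizer.is_allowed("alice@example.com", "deploy", "trigger_deployment"))
```

The point of the design is that the check runs on the server side of the boundary, so a tool that is merely hidden from a client is still refused unless a grant exists.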

Static inventory is not enough

The old supply-chain checklist still applies: pin dependencies, review manifests, use least-privilege credentials, modernize secrets management, restrict pipeline actions, and train developers on the current failure modes. Those are table stakes. Agent systems add a harder requirement: runtime attribution.
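The "pin dependencies" item is easy to check mechanically. The sketch below assumes simple `requirements.txt`-style lines and is deliberately strict: anything that is not a plain `name==version` pin (ranges, wildcards, extras, markers) is reported for manual review. The `audit_requirements` helper is hypothetical, and the denylist simply repeats the two LiteLLM versions SiliconANGLE reportedly named; it is not an authoritative advisory feed.

```python
import re

# Versions SiliconANGLE reportedly named as malicious; illustrative only.
KNOWN_BAD = {("litellm", "1.82.7"), ("litellm", "1.82.8")}

# Matches only an exact pin such as "rich==13.7.1" (no wildcards, extras, or markers).
PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([A-Za-z0-9.!+_-]+)$")


def audit_requirements(lines):
    """Return (unpinned, known_bad) lists for requirements.txt-style lines."""
    unpinned, known_bad = [], []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        match = PIN.match(line)
        if match is None:
            unpinned.append(line)  # not an exact pin: flag for review
            continue
        name, version = match.group(1).lower(), match.group(2)
        if (name, version) in KNOWN_BAD:
            known_bad.append(f"{name}=={version}")
    return unpinned, known_bad


unpinned, known_bad = audit_requirements([
    "requests>=2.0",
    "litellm==1.82.7",
    "rich==13.7.1  # exact pin, not on the denylist",
])
print("unpinned:", unpinned)
print("known bad:", known_bad)
```

A pin alone would not have stopped someone who pinned to a bad release, which is why the denylist check sits next to it.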

Static inventory tells you what was installed. Runtime attribution tells you what happened. Which human launched the agent? Which identity did the agent use? Which plugin provided the tool? Which MCP server handled the call? Which downstream resource was touched? What arguments were passed, or at least what hashed/sanitized representation can be retained safely? Was the action approved, auto-approved, blocked, retried, or executed by a background worker after the terminal went away?
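Those questions map fairly directly onto a record shape. The following is one possible sketch, assuming one record per tool call written to an append-only log; the `AttributionRecord` and `Decision` names and every field name are illustrative, not a schema from Claude Code, MCP, or any vendor.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Decision(str, Enum):
    APPROVED = "approved"
    AUTO_APPROVED = "auto_approved"
    BLOCKED = "blocked"
    RETRIED = "retried"


@dataclass(frozen=True)
class AttributionRecord:
    launched_by: str     # which human launched the agent
    agent_identity: str  # which identity the agent used
    plugin: str          # which plugin provided the tool
    mcp_server: str      # which MCP server handled the call
    resource: str        # which downstream resource was touched
    argument_hash: str   # hashed or sanitized arguments, never raw values
    decision: Decision   # approved, auto-approved, blocked, or retried
    background: bool     # True if a background worker executed the call

    def to_json_line(self) -> str:
        """Serialize to a single JSON line suitable for an append-only log."""
        payload = asdict(self)
        payload["decision"] = self.decision.value
        return json.dumps(payload, sort_keys=True)


record = AttributionRecord(
    launched_by="alice@example.com",
    agent_identity="agent-svc-42",
    plugin="tickets-plugin",
    mcp_server="issue-tracker",
    resource="project/PAY",
    argument_hash="(hash goes here)",
    decision=Decision.BLOCKED,
    background=True,
)
print(record.to_json_line())
```

Making every field required is the design choice that matters: a record that cannot be constructed without a human and an agent identity cannot degrade into "the agent did it."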

That is why Permiso’s related agent-runtime security announcement is directionally interesting even if it is vendor-shaped. The company describes tying every run, event, tool call, and MCP invocation to human, non-human, and AI identities; graphing which human deployed an agent, what identity it used, what sub-agents it spawned, and which systems it touched; and adding kill switches at the identity layer. Strip away the product language and the architecture is right: agents are identities with behavior, not just chat sessions with nicer UX.

Most teams are not there yet. Many have a pile of local configs, personal API tokens, experimental MCP servers, a few plugins installed from a marketplace, and background agents that feel like terminal tabs with better memory. That may be acceptable for a toy repo. It is not acceptable once the agent can see customer data, production credentials, internal tickets, private source, or deployment systems.

What engineers should do Monday morning

Build an agent runtime register. Not a slide. A real inventory. For every Claude Code plugin, MCP server, model proxy, background workflow, hook, and external connector, record the owner, version, source, permission scope, credential source, network exposure, logs retained, approval requirements, and emergency-disable path. If nobody owns it, it is not production infrastructure; it is a liability with a README.
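A register like this can start as a typed record plus two checks. The sketch below mirrors the fields listed above; the `RegisterEntry` class, the helper names, and the sample values are all hypothetical, and a real register would likely live in a database or config repository rather than in memory.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RegisterEntry:
    name: str
    kind: str  # e.g. plugin, mcp_server, model_proxy, workflow, hook, connector
    owner: Optional[str]
    version: Optional[str]
    source: Optional[str]
    permission_scope: Optional[str]
    credential_source: Optional[str]
    network_exposure: Optional[str]
    logs_retained: Optional[str]
    approval_requirements: Optional[str]
    emergency_disable_path: Optional[str]


def missing_fields(entry: RegisterEntry) -> List[str]:
    """Return the names of fields that are unset or empty for this entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


def liabilities(register: List[RegisterEntry]) -> List[str]:
    """Names of components with no owner or no emergency-disable path."""
    return [e.name for e in register
            if not e.owner or not e.emergency_disable_path]


register = [
    RegisterEntry("tickets-mcp", "mcp_server", "platform-team", "1.4.0",
                  "internal registry", "tickets:read", "secrets manager",
                  "internal only", "90 days", "human approval for writes",
                  "central config flag"),
    RegisterEntry("scratch-plugin", "plugin", None, "0.1.0", "marketplace",
                  None, None, None, None, None, None),
]
print(liabilities(register))
```

Running `liabilities` in CI against the register file is one cheap way to keep ownerless components from quietly accumulating.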

Then test an incident instead of admiring the spreadsheet. Pick one MCP server and assume it is compromised. Can you identify which sessions used it in the last 14 days? Can you revoke only its credential without breaking unrelated tools? Can you determine which human or automation identity authorized each call? Can you see whether it touched production data? Can you disable it centrally, or do you have to message every developer and hope they remove a local config?
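The first drill question is answerable only if tool-call logs exist in a queryable form. As a sketch, assume each log record is a dict with `mcp_server`, `session_id`, `principal`, and a timezone-aware `timestamp`; those key names and the `sessions_using_server` helper are assumptions for illustration, not a real log format.

```python
from datetime import datetime, timedelta, timezone


def sessions_using_server(records, server, now, days=14):
    """Map session ID -> set of principals that called `server` within the window."""
    cutoff = now - timedelta(days=days)
    hits = {}
    for rec in records:
        if rec["mcp_server"] != server:
            continue  # a different connector, not part of this incident
        if rec["timestamp"] < cutoff:
            continue  # older than the investigation window
        hits.setdefault(rec["session_id"], set()).add(rec["principal"])
    return hits


now = datetime(2026, 5, 16, tzinfo=timezone.utc)
records = [
    {"mcp_server": "issue-tracker", "session_id": "s1",
     "principal": "alice@example.com", "timestamp": now - timedelta(days=2)},
    {"mcp_server": "issue-tracker", "session_id": "s2",
     "principal": "ci-bot", "timestamp": now - timedelta(days=30)},
    {"mcp_server": "deploy", "session_id": "s3",
     "principal": "ci-bot", "timestamp": now - timedelta(days=1)},
]
print(sessions_using_server(records, "issue-tracker", now))
```

If the logs cannot support a query this simple, that gap is the drill's first finding.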

For Claude Code specifically, the practical controls are boring and good: maintain an approved plugin and MCP list; require authentication on remote MCP transports; use per-connector least-privilege credentials rather than developer supertokens; log tool calls with authenticated principal, session ID, plugin origin, tool name, argument hash, and downstream resource; require human approval for high-risk actions; keep secrets out of broad environment variables where possible; and rotate credentials when a plugin or MCP server is removed.
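The "argument hash" item deserves a concrete shape, because a plain hash of low-entropy arguments can be guessed by brute force. One option, sketched here under assumptions, is a keyed HMAC over canonical JSON using Python's standard `hmac` and `hashlib` modules. The `argument_hash` and `audit_entry` helpers and the field names are hypothetical, and the key would need to come from a secrets manager rather than source code.

```python
import hashlib
import hmac
import json


def argument_hash(arguments: dict, key: bytes) -> str:
    """Keyed SHA-256 over canonical JSON, so key order does not change the hash.

    Arguments must be JSON-serializable; json.dumps raises TypeError otherwise.
    """
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_entry(principal, session_id, plugin, tool, arguments, resource, key):
    """Build a log entry that records the call without storing raw arguments."""
    return {
        "principal": principal,
        "session_id": session_id,
        "plugin_origin": plugin,
        "tool": tool,
        "argument_hash": argument_hash(arguments, key),
        "resource": resource,
    }


key = b"demo-key-do-not-use"  # placeholder; load a real key from a secrets manager
entry = audit_entry("alice@example.com", "s1", "tickets-plugin", "create_ticket",
                    {"title": "Rotate token", "secret": "hunter2"}, "project/PAY", key)
print(json.dumps(entry, sort_keys=True))
```

The hash lets an investigator confirm whether a known set of arguments was used without the log itself becoming a second copy of the secrets.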

Teams should also separate model routing from authority. A model proxy can make cost, latency, and provider policy manageable, but it should not automatically become the place every secret accumulates forever. If the proxy needs credentials, scope them. If it handles sensitive prompts, monitor outbound connections. If it can call multiple providers, log routing decisions. “It is just middleware” is how privileged infrastructure becomes invisible until the breach report gives it a name.

The editorial take is simple: the next serious agent-security failures will look less like science fiction and more like ordinary appsec bugs placed in unusually powerful plumbing. Bad auth. Overbroad tokens. Unpinned packages. Hidden connectors. Missing audit logs. Background workers with stale permissions. MCP and plugin systems are not accessories to the agent boom; they are the new control plane. Treat them accordingly, or they will become the next CI/CD blind spot with a conversational interface.

Sources: SiliconANGLE, Permiso AI agent runtime security announcement, Permiso BusinessWire release, Secure Code Warrior AI security rules
