claude-code

Microsoft’s Agent Governance Toolkit Gives MCP Agents the Policy Layer Prompts Cannot Provide

Anatoliy Kolodkin

28 May 2026 • 5 min read

The most useful thing about Microsoft’s Agent Governance Toolkit is that it refuses to pretend prompts are a permission system. That should not be a controversial position in 2026, but here we are: teams are connecting agents to codebases, databases, ticketing systems, browsers, email drafts, cloud APIs, and MCP servers, then hoping a carefully worded instruction will do the work of access control.

InfoWorld’s fresh write-up of the toolkit lands at the right moment because the coding-agent ecosystem is converging on the same problem from different directions. Claude Code is tightening MCP and subagent policy behavior. OpenAI, Google, Microsoft, Anthropic, and the open-source agent frameworks are all normalizing tool-calling agents. MCP is becoming the shared boundary where agents meet real systems. The question is no longer whether agents can act. The question is who decides what they are allowed to do before the tool call executes.

Microsoft’s open-source Agent Governance Toolkit, released under the Microsoft organization with an MIT license, puts that decision point on the execution path. Actions are evaluated before they run, checked against deterministic policy, allowed or denied, and logged. InfoWorld reports Microsoft expects policy evaluation to take less than 0.1 milliseconds per operation. Microsoft’s own launch post describes deterministic, sub-millisecond enforcement across all 10 OWASP agentic AI risk categories, with packages covering policy enforcement, identity and trust, runtime sandboxing, SRE controls, compliance, marketplace and plugin governance, reinforcement-learning governance, and audit-oriented execution layers.

The right abstraction is a tool-call firewall

The core idea is not exotic. It is almost aggressively familiar: put a policy decision point between an actor and the thing it wants to do. Operating systems did this with process isolation and privilege boundaries. Service meshes did it with identity and traffic policy. Cloud platforms did it with IAM. CI systems do it with protected branches and required approvals. Agent systems need the same primitive because an agent with tools is not a chatbot. It is a program-shaped coworker with uncertain judgment and very high typing speed.

That is why AGT’s govern()-style wrapper and YAML policy examples are more important than the packaging. The docs show policies that deny destructive database operations such as drop, delete, and truncate, and require approval for sensitive side effects such as send_email. This is the right mental model. Do not ask the agent to “please be careful with production data.” Put a gate in front of production-affecting tools. Deny by default where the blast radius is high. Require approval where the action leaves the sandbox. Log the decision either way.

This maps directly to Claude Code and MCP. Claude Code’s official MCP docs position MCP as access to issue trackers, monitoring dashboards, databases, Figma, Slack, Gmail drafts, and event channels, while warning users to verify trust because servers that fetch external content can expose prompt-injection risk. Recent Claude Code releases have been fixing managed MCP configuration, allow/deny behavior, subagent policy inheritance, per-MCP usage attribution, and runtime warnings. That is the same category of work AGT is attacking from the framework side: make the tool boundary governable.

MCP tool poisoning is not theoretical enough to ignore

The .NET MCP example is the part practitioners should read twice. Microsoft’s post describes an McpSecurityScanner that flags a suspicious tool definition named read_flie—a typo-squatted shape of read_file—with embedded instruction text telling the model to ignore prior instructions and exfiltrate file contents. The scanner assigns a risk score of 85 out of 100 with critical tool-poisoning findings and a high typosquatting finding.

That example captures a subtle but important point: MCP tool definitions are context. A host that blindly hands tool descriptions to the model is exposing the model to executable-seeming persuasion before the user has even requested a tool call. A malicious or compromised MCP server does not have to start by running a command. It can start by describing a command in a way that changes the model’s behavior. Treating tool metadata as a security object—scan it, score it, classify it, block or quarantine it—is a better posture than treating “server connected” as success.

This is also where agent security differs from normal API security. A traditional API client does not read its OpenAPI description and become emotionally committed to violating policy. An LLM agent might ingest a poisoned tool description, combine it with user context, and choose a path the host did not intend. The enforcement point cannot live entirely inside the model’s instruction hierarchy. The host, proxy, gateway, or wrapper has to mediate what tools are registered, what inputs are allowed, what outputs return to the model, and what audit trail remains.

Cost controls belong in the same policy conversation

InfoWorld also surfaces a less theatrical but very real problem: agents are chatty. They make many context-building queries, retry failed calls, and can overwhelm APIs that worked fine when humans were the primary callers. AGT’s budget management and throttling features address this directly, including rejecting actions likely to exceed token limits and limiting API-call volume over time.

This is where agent governance and FinOps become the same spreadsheet. A coding agent connected to search, test runners, package managers, internal documentation, observability, issue trackers, and multiple MCP servers can generate meaningful spend and load without being malicious. It can simply be uncertain, over-broad, or stuck in a retry loop. A governance layer that only blocks “bad” actions but ignores cost and rate limits is incomplete. The agent does not need evil intent to create an outage or a surprise bill.

Claude Code teams should recognize the pattern. Anthropic has been adding usage breakdowns for skills, subagents, plugins, and per-MCP-server cost, plus large session-file accounting and optional OpenTelemetry attributes. Those signals help explain where the money went after or during a run. A toolkit like AGT, or an internal equivalent, is the other side of the loop: prevent runaway behavior before it becomes an incident. Observability tells you what happened. Policy decides whether the next thing is allowed to happen.

Steal the pattern even if you do not adopt the toolkit

AGT currently spans Python, TypeScript, Rust, Go, and .NET, with docs listing five languages, 19 integrations, 10 formal specs, and thousands of GitHub stars depending on the measurement path. Microsoft’s launch post says the architecture was designed around existing frameworks such as LangChain, CrewAI, AutoGen, Microsoft Agent Framework, Foundry Agent Service, Google ADK, OpenAI Agents, LlamaIndex, Haystack, Mastra, MCP, and A2A. The point is not that every team should immediately standardize on Microsoft’s implementation. The point is that the category is becoming real.

If you are building or operating coding-agent workflows, steal the architecture first. Put a policy decision point in front of high-risk tools. Keep policy in version-controlled configuration rather than scattering it across prompts. Deny destructive operations by default. Require approval for external side effects. Rate-limit broad context tools. Cap token-heavy calls. Log every allow and deny decision with the actor, tool, input class, policy version, and reason. Emit metrics your existing observability stack can consume. For MCP, scan tool definitions before exposure, sanitize tool outputs before they return to the model, and treat shadow MCP servers as production risk.

The comparison point for Claude Code is sharp. Native controls are improving quickly: managed MCP settings, scopes, /mcp visibility, tool-output token warnings, reconnect behavior, subagent policy fixes, and per-MCP usage attribution are all moving in the right direction. But native controls inside one agent runtime should not be the only governance layer for an organization. If the same MCP server is callable from Claude Code, a custom internal agent, a LangGraph workflow, and a dashboard automation, policy should not depend on each client remembering the same prompt or implementing the same if-statement.

The editorial takeaway is blunt: prompts express intent; they do not enforce authorization. MCP is too useful to avoid, and tool-calling agents are too valuable to keep in toy sandboxes forever. That means the industry needs the boring stuff now: policy engines, scoped identity, budget controls, audit logs, approval workflows, tool-definition scanning, output sanitization, and kill switches. Microsoft’s AGT may or may not become the standard implementation. The pattern absolutely will.

Sources: InfoWorld, Microsoft Open Source Blog, Agent Governance Toolkit docs, Microsoft .NET Blog, GitHub repository, OWASP Top 10 for Agentic Applications 2026

The right abstraction is a tool-call firewall

MCP tool poisoning is not theoretical enough to ignore

Cost controls belong in the same policy conversation

Steal the pattern even if you do not adopt the toolkit

Sign up for more like this.