ai-frameworks

Microsoft Agent Framework 1.9 Turns Skills and MCP Discovery Into Runtime Governance

Anatoliy Kolodkin

04 Jun 2026 • 5 min read

Microsoft Agent Framework 1.9.0 is the kind of release that looks boring until you ask the only question that matters in production: who decides what an agent is allowed to know, load, and call?

The answer used to be “the prompt,” which is convenient and mostly fictional. The new .NET 1.9.0 release points in a better direction. Microsoft is moving skills, MCP discovery, progressive tool exposure, background agents, hosted-agent persistence, observability, and eval integration into the framework layer. That is less demo-friendly than another animated agent canvas, but it is exactly where production teams need the work to happen.

The release landed June 3 with a change list that reads like an enterprise agent team’s backlog after the first real incident review. Python gets McpSkillsSource, progressive tool exposure through FunctionInvocationContext, a hosted Toolbox MCP skills sample, background agent support for harness agents, canonical hosted MCP call/result persistence, Foundry Adaptive Evals integration, native structured output support for Bedrock Converse API, and fixes around observability serialization and OTLP HTTP endpoints. The .NET side gets the 1.9 package wave, AGUI hosting and workflow fixes, ILoggerFactory and IServiceProvider support in HarnessAgent, stable promotion for declarative workflow packages, hosted agent updates, and workflow metadata fixes such as preserving CreatedAt.

That is not one feature. It is a control-plane release wearing a package-bump jacket.

Skills are becoming capability management, not prompt decoration

The most important word in this release is not “agent.” It is “exposure.” Microsoft’s Agent Skills design uses a SKILL.md file with YAML frontmatter and markdown instructions, optionally bundled with scripts, references, and assets. The clever part is progressive disclosure: roughly 100 tokens per skill can be advertised up front, while the full instruction set and resources are loaded only when the agent actually needs them.

That sounds like a context-window optimization, and it is. But for practitioners, the better framing is capability management. A tool or skill the model cannot see yet is harder to misuse, harder for prompt injection to target, and easier to govern. Progressive exposure is how agent frameworks stop acting like giant menus and start acting like permissioned runtimes.

This matters more once MCP enters the picture. MCP-based skill discovery means capabilities can arrive from outside the local codebase. That is useful because teams do not want to hand-wire every tool and integration forever. It is also dangerous because discovered capabilities are dependencies, not decorations. A malicious or sloppy skill can change agent behavior through instructions; a bundled helper script can do real work; a reference file can bias decisions in ways that never show up in a package manifest.

Microsoft’s own guidance is refreshingly blunt here: treat skills like open-source dependencies because skill instructions enter the agent context and can influence behavior. That warning should be printed on the box of every agent framework. If your agent can discover tools over MCP, your threat model now includes provenance, version pinning, tool descriptions, prompt-injection surfaces, and whether a newly discovered capability should be visible before policy has approved it.

The boring enterprise pieces are the product

Microsoft Agent Framework is increasingly interesting because it is not trying to win the “coolest agent demo” contest. It is trying to own the boring path from prototype to production inside Microsoft-heavy organizations: .NET and Python APIs, Foundry hosting, Durable Task-style workflows, OpenTelemetry, declarative YAML agents, DevUI, A2A, MCP, hosted agents, and migration paths from Semantic Kernel and AutoGen.

That positioning matters. The agent ecosystem is already fragmented: LangGraph for graph-heavy Python teams, Pydantic AI for typed Python workflows, CrewAI for role/task abstractions, OpenAI Codex and Claude Code for coding-agent surfaces, plus a growing pile of MCP servers and local harnesses. Microsoft does not need every developer to decide MAF is cooler. It needs enterprise teams to decide MAF is easier to operate, observe, and govern than stitching together five libraries and a wish.

The 1.9 release leans into that. Foundry Adaptive Evals integration says evaluation belongs close to the runtime, not as a separate notebook ritual. Observability serialization and OTLP endpoint fixes say traces need to survive contact with production telemetry stacks. Hosted MCP call/result persistence says tool interactions are operational records, not transient chat artifacts. Background agent support says not every useful agent run is a foreground conversation with a human watching the spinner.

Those are the seams senior engineers should inspect. Can you reconstruct which skill was advertised, which one was loaded, which MCP tool was called, which approval policy applied, what the hosted agent persisted, and what trace was emitted? If the answer is no, you do not have a production agent system. You have a prompt that occasionally touches infrastructure.

What teams should do with this release

If you are already in the Microsoft ecosystem, MAF 1.9 deserves a practical spike. Do not evaluate it by asking whether it can produce an impressive demo in 20 minutes. Every framework can do that now. Evaluate the runtime contract.

Start with skills. Build a small internal skill source and require provenance metadata, owners, versioning, and review. Log when a skill is advertised versus fully loaded. Separate trusted filesystem skills from experimental MCP-discovered skills. Do not let skill discovery automatically imply tool execution permission. If a skill includes scripts, treat those scripts like code shipped into your production path, because that is what they are.

Then test progressive tool exposure against actual attack scenarios. Put an untrusted document in context that tries to invoke a privileged tool by name. Confirm the model cannot call tools that have not been exposed. Confirm per-tool approval modes are enforced outside the model’s natural-language reasoning. Confirm audit logs show the policy decision, not just the model’s explanation of the decision. The explanation is useful. The enforcement point is what saves you.

Finally, test the runtime plumbing. Run a background agent, persist MCP call/result records, emit traces through your OTLP pipeline, and attach Foundry evals to the workflow. Break things deliberately: malformed tool outputs, unavailable MCP servers, denied approvals, restarted workers, and missing credentials. Agent frameworks reveal their maturity in failure handling, not happy-path demos.

The tradeoff is complexity. MAF is not the lightest choice for a two-script internal assistant. If your team is Python-native, OSS-first, and already deep in LangGraph or Pydantic AI, switching for a point release would be silly. But even if you do not adopt MAF, the pattern is worth copying: capabilities should be progressively disclosed, skills should be treated as dependencies, tool exposure should be governed by runtime policy, and observability should make agent actions reconstructable after the fact.

The broader story is that agent frameworks are converging on the same uncomfortable truth. The model is not the system. The system is the runtime around the model: tools, skills, approvals, identity, traces, evals, hosted state, and the policies that decide what becomes visible when. Microsoft Agent Framework 1.9.0 is a useful signal because it moves more of that responsibility into framework plumbing.

That is where it belongs. The safest tool is still the one the model cannot see until context, policy, and provenance say it should.

Sources: Microsoft Agent Framework 1.9.0 release, Microsoft Agent Framework repository, Microsoft Agent Skills overview, Microsoft Agent Framework 1.0 announcement, Microsoft Agent Governance Toolkit MCP extensions

Skills are becoming capability management, not prompt decoration

The boring enterprise pieces are the product

What teams should do with this release

Sign up for more like this.