
Azure AI Foundry's New Observability Guide Is the Most Practical Comparison of Agent Frameworks You Will Find This Week

Anatoliy Kolodkin

29 Apr 2026 • 4 min read

Most vendor documentation about agent frameworks is written to make the vendor's choice look obvious. The observability comparison Microsoft published this week does something more useful: it tells you the truth about what each framework actually requires from your engineering team, and the answer varies enough to matter.

The post, from the Azure Infrastructure Blog, breaks down how Microsoft Agent Framework, Semantic Kernel, LangChain, LangGraph, and the OpenAI Agent SDK each handle observability inside Azure AI Foundry. The comparison is organized around five specific dimensions: agent decision flow visibility, tool invocation tracing, multi-agent support visibility, automatic span hierarchy, and code changes required. What makes it worth reading is that the answers differ from framework to framework, and the differences are exactly the ones that determine how much custom instrumentation work your team owns.

The Gap Between "Works" and "You Will See It Working"

Microsoft Agent Framework (MAF) emits full observability automatically inside Foundry. Agent decision flows, tool calls, multi-agent interactions, token usage, latency, and errors are all captured without code changes. That is the headline, and it is also the ceiling for the comparison. If you build on MAF, you get production-grade observability as a platform feature. Your SRE team can see what the agent decided, why it chose a particular action, which step came first, and how it reached the final answer. Multi-agent traces are fully supported; a single request that moves between agents stays one correlated trace.

Semantic Kernel sits in a useful middle ground. It emits telemetry for prompts, responses, function calls, token usage, and latency through Azure inference connectors. The setup is not automatic — you need to attach the connector and configure the environment — but it is low-code, not pro-code. Multi-agent visibility is partially supported: the framework can handle multi-agent flows, but cross-agent visibility may require manual correlation that MAF handles automatically. For teams already committed to .NET or Python who want structure without full managed hosting, this is a reasonable tradeoff.
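To make "manual correlation" concrete: one common way to keep two agents in the same trace by hand is to pass a W3C Trace Context traceparent value at each handoff, keeping the trace ID and minting a new span ID for the callee. The sketch below uses only the Python standard library and is not Semantic Kernel's or Foundry's API; the planner and research agents are hypothetical names used for illustration.

```python
import re
import secrets

# W3C Trace Context "traceparent" layout: version-traceid-parentid-flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def _parse(header: str) -> tuple:
    """Split a traceparent value into (trace_id, span_id, flags)."""
    match = _TRACEPARENT.match(header)
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    return match.groups()


def new_traceparent() -> str:
    """Start a new trace: random 16-byte trace id, random 8-byte span id, sampled."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(parent: str) -> str:
    """Keep the caller's trace id and flags, mint a fresh span id for the callee."""
    trace_id, _parent_span_id, flags = _parse(parent)
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def trace_id_of(header: str) -> str:
    """Return the trace id, the value a backend would use to join the spans."""
    return _parse(header)[0]


# Hypothetical handoff: the planner agent passes its header to the research agent.
planner_header = new_traceparent()
research_header = child_traceparent(planner_header)
```

Both headers carry the same trace ID, so a backend that groups by trace ID would show the two agents as one request. This is the bookkeeping a team owns whenever a framework does not propagate context across agents on its own.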

The LangChain story is honest about the overhead. Azure AI Foundry integrates with LangChain through an OpenTelemetry-based tracer in the langchain-azure-ai package. Once configured, you get consistent observability. But the phrase "requires explicit configuration" in a vendor comparison table is a polite way of saying "this is your team's problem until you finish the setup." The same applies to LangGraph, which uses the same tracer but adds configuration complexity for graph-based stateful workflows. The benefit is that span hierarchy is preserved across graph nodes, which makes execution order clearer in complex branching scenarios.

The OpenAI Agent SDK Admission Is More Important Than It Looks

The most notable thing in the comparison is what Microsoft says about the OpenAI Agent SDK: no built-in telemetry. Zero. "You must manually instrument OpenTelemetry spans and export them to Application Insights yourself." There is no automatic span hierarchy. Multi-agent tracing is not supported by default.
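For a sense of what that manual work involves, here is a minimal sketch of hand-rolled span instrumentation around an agent run and one tool call. It is a standard-library stand-in that mimics the shape of a tracing SDK; it is not the OpenTelemetry API, the OpenAI Agent SDK, or an Application Insights exporter. The names span, FINISHED, run_agent, and lookup_weather are all hypothetical.

```python
import contextvars
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    attributes: dict = field(default_factory=dict)
    duration_ms: float = 0.0


# Tracks the active span so nested calls can find their parent.
_current_span = contextvars.ContextVar("current_span", default=None)

# Stand-in exporter: a real setup would ship finished spans to a backend.
FINISHED = []


@contextmanager
def span(name, **attributes):
    """Open a span that inherits the trace id of whatever span is active."""
    parent = _current_span.get()
    current = Span(
        name=name,
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_id=parent.span_id if parent else None,
        attributes=dict(attributes),
    )
    token = _current_span.set(current)
    start = time.perf_counter()
    try:
        yield current
    except Exception as exc:
        current.attributes["error"] = repr(exc)  # record the failure, then re-raise
        raise
    finally:
        current.duration_ms = (time.perf_counter() - start) * 1000
        _current_span.reset(token)
        FINISHED.append(current)


def lookup_weather(city):
    # Hypothetical tool; every tool needs its own wrapper like this one.
    with span("tool.lookup_weather", city=city):
        return f"sunny in {city}"


def run_agent(question):
    with span("agent.run", question=question) as root:
        answer = lookup_weather("Oslo")
        root.attributes["answer"] = answer
        return answer


result = run_agent("What is the weather in Oslo?")
```

After the run, FINISHED holds the tool span followed by the agent span, with the tool span's parent_id pointing at the agent span. Every tool, model call, and handoff needs the same wrapping, which is the cost the comparison is describing.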

Microsoft is essentially publishing a comparison that says "if you build here, you own all the observability work." That is a remarkable thing for a vendor to include in official documentation, and it tells you something about where Microsoft thinks the market is heading. The OpenAI Agent SDK is for teams that need maximum low-level control over agent execution and are willing to pay the operational cost. But those teams should budget engineering time for custom observability as a first-class deliverable, not an afterthought.

This matters beyond the framework comparison because it illustrates a broader pattern in enterprise AI adoption. The gap between "our agent works in a demo" and "our SRE team can see what our agent is doing in production" is not a small gap. It is the gap that determines whether your agent system survives contact with a real on-call rotation. Teams that treat observability as a production requirement — not a nice-to-have — will make different framework choices than teams that treat it as a post-deployment concern.

What the Comparison Tells You About Azure AI Foundry's Direction

The broader signal is that Azure AI Foundry is maturing into a real operations platform. The integration with Azure Monitor, Application Insights, Managed Identity, and RBAC means agent workloads are being treated as first-class Azure resources — not AI experiments bolted onto the side. The observability comparison is part of that story: Microsoft is publishing the kind of detailed framework comparison that platform engineering teams actually need when making architectural decisions, and it is publishing it through official channels rather than waiting for an analyst report or a community post.

For teams evaluating agent architecture today, the comparison is a useful decision artifact even if you are not deploying on Azure. The five dimensions — decision flow visibility, tool invocation tracing, multi-agent support, span hierarchy, and code changes required — are the right questions to ask of any agent framework in any environment. The fact that Microsoft Agent Framework wins on all of them inside its own platform is expected. The fact that the comparison is honest about where the other frameworks fall short is more useful than marketing material that pretends the gaps do not exist.
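One way to use the five dimensions as a checklist is to record a yes or no per dimension for each candidate framework and list what is left for your team to build. The sketch below is an illustrative assumption about how to structure that, not something from the Microsoft post. The two example entries are transcribed from this article's summary of the two extremes, not from the original table, and the dimension keys are hypothetical names.

```python
# The five dimensions, phrased so that True means "the platform does it for you".
DIMENSIONS = (
    "decision_flow_visibility",
    "tool_invocation_tracing",
    "multi_agent_support",
    "automatic_span_hierarchy",
    "no_code_changes",
)


def gaps(answers):
    """Return the dimensions a team would have to instrument itself.

    A missing or False answer counts as a gap, so unknowns are treated
    conservatively rather than assumed to be covered.
    """
    return [dim for dim in DIMENSIONS if not answers.get(dim, False)]


# As summarized in this article: everything captured automatically.
MICROSOFT_AGENT_FRAMEWORK = {dim: True for dim in DIMENSIONS}

# As summarized in this article: no built-in telemetry of any kind.
OPENAI_AGENT_SDK = {dim: False for dim in DIMENSIONS}

maf_gaps = gaps(MICROSOFT_AGENT_FRAMEWORK)
sdk_gaps = gaps(OPENAI_AGENT_SDK)
```

Frameworks with partial support, such as Semantic Kernel's multi-agent visibility, do not fit a plain yes or no. Marking those False and noting the caveat separately keeps the gap list honest.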

The practical takeaway is straightforward: if you are building on Microsoft Agent Framework inside Foundry, you are getting the most operationally mature observability story available in that environment. If you are building on LangChain or LangGraph, the tracer exists and is functional, but you are accepting some configuration overhead. If you are building on the OpenAI Agent SDK, treat custom observability as a first-class engineering workstream, not a phase-two deliverable. That is the kind of advice that only gets published when a vendor is confident enough in its own answer to be honest about everyone else's gaps.

Sources: Microsoft TechCommunity Azure Infrastructure Blog, Microsoft Learn — AG-UI, Microsoft Agent Framework GitHub
