PydanticAI v1.89.0 Turns Cross-Run Correlation and Dynamic Capabilities Into First-Class Concerns

There is a version of agent framework maturity that looks like new features — more tools, more models, more sample apps. Then there is a version that looks like the kind of work you only do when you have real agents in real systems touching real data. PydanticAI v1.89.0, shipped May 1, is the second kind. The headline additions are three features that individually look modest. Together they tell a coherent story about what production agent reliability actually requires: the ability to correlate behavior across runs, gate capabilities at runtime, and control tool exposure with surgical precision.

The first is conversation_id for cross-run correlation, merged as PR #5251. This is exactly the kind of plumbing that sounds boring until your monitoring stack needs it. In a production agent system, a single user interaction often produces multiple underlying LLM runs: retries, subagent calls, tool invocations that branch into their own reasoning loops. Without a shared correlation identifier threaded through all of them, you get fragmented traces: three separate run records that you have to manually reconstruct into a story. conversation_id supplies that shared identifier, so your observability tooling can assemble a complete picture of what happened across a multi-turn interaction without you having to build the correlation logic yourself. For teams using LangSmith, Logfire, or any custom tracing layer, this is not a nice-to-have. It is the difference between a trace and a story.
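
To make the correlation idea concrete, here is a minimal, library-free sketch. It does not use PydanticAI's API (how conversation_id is actually passed to a run is not shown here); RunRecord, record_run, and group_by_conversation are hypothetical names invented for illustration. It only shows why a shared identifier turns scattered run records into one ordered story per interaction.

```python
from collections import defaultdict
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class RunRecord:
    """Hypothetical trace record; not a PydanticAI type."""
    run_id: str           # unique per LLM run
    conversation_id: str  # shared by every run in one user interaction
    kind: str             # e.g. "main", "retry", "subagent"


def record_run(log: list, conversation_id: str, kind: str) -> RunRecord:
    # Each run gets a fresh run_id but carries the caller's conversation_id.
    record = RunRecord(run_id=uuid4().hex, conversation_id=conversation_id, kind=kind)
    log.append(record)
    return record


def group_by_conversation(log: list) -> dict:
    # Rebuild one ordered list of run kinds per conversation from the flat log.
    grouped = defaultdict(list)
    for record in log:
        grouped[record.conversation_id].append(record.kind)
    return dict(grouped)


log: list = []
record_run(log, "conv-a", "main")
record_run(log, "conv-b", "main")      # an unrelated interaction, interleaved
record_run(log, "conv-a", "retry")
record_run(log, "conv-a", "subagent")

traces = group_by_conversation(log)
```

Without the conversation_id field, the only way to tell that the retry and the subagent call belong to the first interaction would be to infer it from timestamps or payloads, which is the manual reconstruction described above.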

The second feature, dynamic capabilities via callables in PR #5252, is more architecturally interesting because it changes the question you ask about agent permissions. Before this change, you declared an agent's capabilities statically: here is what this agent can do, set at construction time, fixed for the lifetime of the process. With this change, you can pass a function that evaluates the runtime context (user identity, request content, system state) and returns a capability decision on the fly. This is a meaningful shift from "capabilities are a deployment decision" to "capabilities are a runtime policy." For teams building multi-tenant agent systems, or agents that serve users with different permission levels, this is the difference between deploying separate agent instances per trust level and running one agent whose permissions gate themselves based on who is asking. The callable pattern is also cleaner than branching inside tool implementations, because it keeps the policy decision separate from the tool logic.
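
The callable pattern can be sketched in plain Python. This is an illustration of the idea only, not PydanticAI's interface: RequestContext, role_policy, and call_tool are hypothetical names, and the real signature the framework expects for a capability callable is an assumption left out of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical runtime context; not a PydanticAI type."""
    user_role: str
    tenant: str


# A policy is any callable that maps the runtime context to allowed capability names.
CapabilityPolicy = Callable[[RequestContext], FrozenSet[str]]


def role_policy(ctx: RequestContext) -> FrozenSet[str]:
    # The policy decision lives here, evaluated per request rather than at construction time.
    allowed = {"search_docs"}
    if ctx.user_role == "admin":
        allowed |= {"export_data", "delete_record"}
    return frozenset(allowed)


def call_tool(name: str, ctx: RequestContext, policy: CapabilityPolicy) -> str:
    # The tool dispatcher asks the policy; tool bodies contain no permission branching.
    if name not in policy(ctx):
        raise PermissionError(f"{name!r} is not allowed for role {ctx.user_role!r}")
    return f"{name} ran for tenant {ctx.tenant}"


admin = RequestContext(user_role="admin", tenant="acme")
viewer = RequestContext(user_role="viewer", tenant="acme")
```

The same dispatcher serves both users; only the context passed to the policy differs, which is what lets one agent instance cover several trust levels.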

The third feature — builtin_tools in agent.override, from PR #5248 — addresses a specific pain point that surfaces in production but rarely in demos. PydanticAI ships with built-in tools: file search, web fetch, and the other utilities that most agents need. The problem is that production deployments often want some of those tools but not all of them. Until now, the override mechanism let you add or replace tools, but did not give you a clean way to specify exactly which built-in tools should be active in a given agent configuration. builtin_tools in override adds that control. You can now say: I want this agent to have file search but not web fetch, without having to build a custom tool list from scratch or subclass the agent to strip things out. It is a small ergonomics improvement that becomes important the moment you stop running one-size-fits-all agent configurations.
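
As a rough picture of what a scoped override of built-in tools looks like, the toy class below mimics the shape of the pattern with a context manager. ToyAgent is a hypothetical stand-in written for this example, not PydanticAI's Agent, and the string tool names are placeholders; the actual argument types accepted by agent.override are not shown here and should be checked against the release notes.

```python
from contextlib import contextmanager


class ToyAgent:
    """Hypothetical stand-in for an agent; not PydanticAI's Agent class."""

    def __init__(self, builtin_tools):
        self.builtin_tools = tuple(builtin_tools)

    @contextmanager
    def override(self, *, builtin_tools):
        # Swap in the requested subset for the duration of the with-block only.
        previous = self.builtin_tools
        self.builtin_tools = tuple(builtin_tools)
        try:
            yield self
        finally:
            # Restore the original configuration even if the block raises.
            self.builtin_tools = previous


agent = ToyAgent(["file_search", "web_fetch"])

with agent.override(builtin_tools=["file_search"]):
    inside = agent.builtin_tools   # only file search is exposed here

after = agent.builtin_tools        # the full set is back once the block exits
```

The point of the scoped form is that the restricted configuration cannot leak: the original tool set is restored when the block ends, so the same agent definition can be reused under different exposure rules.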

The bug fix, non-daemon threads for background evaluators in PR #5247 by Alex Mojaki, is worth understanding in context. Python does not wait for daemon threads at shutdown: once the main thread and every other non-daemon thread have finished, the interpreter exits and any daemon threads still running are stopped abruptly. If your evaluation runners were daemon threads, a fast-finishing main program could terminate eval work mid-computation, silently losing results. The fix switches background evaluators to non-daemon threads, which the interpreter waits for, so eval work completes before the process exits. For teams running batch evaluation pipelines or background eval inside longer-running services, this is the kind of fix that prevents subtle data loss that would be very difficult to debug from the outside.
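
The daemon versus non-daemon difference is easy to reproduce with the standard library alone. The sketch below is not PydanticAI's evaluator code; it launches a small child script twice, once with a daemon thread and once with a non-daemon thread, where the evaluate function is a stand-in for slow evaluation work.

```python
import subprocess
import sys
import textwrap

# Child script: start a background "evaluator" thread, then let the main thread end.
CHILD = textwrap.dedent("""
    import sys, threading, time

    def evaluate():
        time.sleep(0.5)  # stand-in for slow evaluation work
        print("eval result saved", flush=True)

    daemon = sys.argv[1] == "daemon"
    threading.Thread(target=evaluate, daemon=daemon).start()
    # The main thread finishes here; the interpreter waits only for non-daemon threads.
""")


def run_child(mode: str) -> str:
    # Run the child in a fresh interpreter and capture whatever it printed.
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout.strip()


daemon_output = run_child("daemon")          # thread is abandoned at exit
non_daemon_output = run_child("non-daemon")  # interpreter waits for the thread
```

In the daemon case the child exits before the sleep finishes, so nothing is printed and the "result" is lost without any error. In the non-daemon case the interpreter waits for the thread and the line is printed. The trade-off is that a non-daemon thread that never finishes will keep the process alive, so long-running evaluators need their own timeout or cancellation.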

What makes v1.89.0 worth writing about is not any single addition. It is the pattern. These three features (correlation IDs, dynamic capability gates, and tool exposure control) are all about the same thing: making agents behave predictably and observably in multi-run, multi-tenant, production environments. They are not impressive in a demo. They are impressive in a system that has been running for six months and suddenly needs to answer the question "why did this agent do that, for this user, in this session?" PydanticAI has always sold itself on type safety. The deeper claim is that type safety extends to runtime behavior, not just input/output shapes. v1.89.0 advances that claim by making correlation, capability policies, and tool configuration first-class configurable concerns instead of things you duct-tape together after the framework stops being enough.

For builders, the practical next step is straightforward: if you run PydanticAI agents in production, evaluate whether conversation_id could improve your tracing story, whether dynamic capability callables could replace whatever branching logic you currently use to gate tool access, and whether builtin_tools in override could simplify your agent configuration management. None of these are urgent upgrades on their own. Together they represent the kind of framework maturation that tends to compound — once you have correlation IDs, your eval pipelines get better; once you have dynamic capabilities, your multi-tenant story gets cleaner; once you have tool exposure control, your security review process gets simpler. The unglamorous work is what durable systems are made of.

Sources: GitHub releases, PR #5251, PR #5252, PR #5248, PR #5247