Pydantic AI 1.93 Gives Builders Explicit Tool Choice — Because Agent Control Should Not Be a Side Effect

Pydantic AI 1.93 looks like a minor framework release until you notice what it is actually moving: tool authority out of model vibes and into application policy. That is the right direction. Agent frameworks spent the last two years making tools easy to register; the next two will be about making them safe, observable, and boring enough to run in production without a human squinting at every trace.

The release adds explicit tool_choice, separate events for output-tool calls and results, and safer cancellation cleanup for spawned tasks. In ordinary changelog language, those are three implementation details. In a production agent system, they map almost exactly to three failure classes teams keep rediscovering: models using the wrong tools, traces that cannot reconstruct what happened, and background work surviving after the request that launched it has died.

Tool choice is policy, not prompt decoration

The most important change is tool_choice in ModelSettings. Pydantic AI now supports values such as 'auto', 'required', 'none', and explicit lists like ['tool_a', 'tool_b']. That sounds small if you think of tools as an LLM feature. It is not small if you think of tools as authority: the ability to read state, touch external systems, spend money, trigger workflows, or leak context through a badly scoped API.

Before this release, users sometimes had to tunnel provider-specific settings through escape hatches such as extra_body={'tool_choice': 'none'}. The pull request notes the obvious problem: that does not work everywhere. Provider-specific control is a poor place to encode application security policy. If one model backend respects it, another ignores it, and a third uses a different schema, your agent’s authority model depends on whichever adapter happened to run today. That is not engineering; that is hoping the plumbing shares your values.

PR #3611 is large for a reason: 166 commits, 79 changed files, 9,298 additions, and 200 deletions. This is not just a type alias bolted onto a settings object. The implementation has to translate a shared policy concept into provider-specific API shapes, validate tool names, warn about conflicts, and preserve behavior across the framework’s model abstraction layer. That is exactly where this belongs. Application code should say “this run may call no function tools” or “this step must call one of these two tools.” The framework should make that portable.
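To make the "policy, not provider quirk" idea concrete, here is a minimal stdlib-only sketch of validating a tool_choice value against the registered function tools before any provider adapter sees it. It does not import Pydantic AI and is not the framework's implementation; resolve_tool_choice and its return shape are hypothetical names invented for illustration, and treating an explicit list as "must call one of these" follows this article's reading rather than a confirmed library rule.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical stand-in for the policy values described above.
ToolChoice = Union[str, Sequence[str]]

_MODES = ("auto", "required", "none")


def resolve_tool_choice(
    choice: ToolChoice, registered: Sequence[str]
) -> Tuple[str, List[str]]:
    """Return (mode, allowed function-tool names) or raise on a bad policy.

    Assumption: an explicit list means "the model must call one of these".
    """
    if isinstance(choice, str):
        if choice not in _MODES:
            raise ValueError(f"unknown tool_choice mode: {choice!r}")
        # 'none' exposes no function tools; the other modes expose all of them.
        allowed = [] if choice == "none" else list(registered)
        return choice, allowed

    names = list(choice)
    if not names:
        raise ValueError("an explicit tool list must not be empty")
    unknown = [name for name in names if name not in registered]
    if unknown:
        # Fail before the request is built, not after a provider ignores it.
        raise ValueError(f"tool_choice names unregistered tools: {unknown}")
    return "required", names


if __name__ == "__main__":
    print(resolve_tool_choice(["lookup_order"], ["lookup_order", "refund"]))
```

The point of validating up front is that a typo in a tool name becomes a configuration error in your own code instead of a silently ignored setting in one backend.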

There is an important nuance: tool_choice applies to registered function tools, not internal output tools used for structured output. If an agent has output_type=SomeModel and tool_choice='none', the output tool can still be available, with Pydantic AI warning about the distinction. That warning matters. Many frameworks implement structured output as a tool-shaped internal mechanism. If your policy says “no tools,” but your telemetry shows a tool call anyway, you need to know whether that was a side-effecting external action or the model submitting a final typed answer. Collapsing those two into the same bucket makes audits useless.
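The function-tool versus output-tool distinction can be sketched in a few lines. This is an illustrative stand-in, not Pydantic AI code: tools_exposed_to_model is a hypothetical helper, tool names are plain strings, and the handling of values other than 'none' is an assumption made for the example.

```python
import warnings
from typing import List, Sequence, Union


def tools_exposed_to_model(
    tool_choice: Union[str, Sequence[str]],
    function_tools: Sequence[str],
    output_tools: Sequence[str],
) -> List[str]:
    """Apply the policy to function tools only; output tools pass through."""
    if tool_choice == "none":
        if output_tools:
            # Mirrors the idea of warning about the distinction.
            warnings.warn(
                "tool_choice='none' disables function tools only; "
                "output tools stay available for structured output",
                stacklevel=2,
            )
        return list(output_tools)
    if isinstance(tool_choice, str):
        # Assumed behavior for 'auto' and 'required': nothing is filtered.
        return list(function_tools) + list(output_tools)
    allowed = set(tool_choice)
    return [t for t in function_tools if t in allowed] + list(output_tools)


if __name__ == "__main__":
    print(tools_exposed_to_model("none", ["send_email"], ["final_result"]))
```

Keeping the two lists separate all the way to the request builder is what lets an audit later say which calls could have had side effects.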

Tracing needs semantic boundaries, not just more events

That is why PR #5320 is more interesting than it first appears. Pydantic AI added OutputToolCallEvent and OutputToolResultEvent because successful output tool calls previously did not fire the same call/result events that downstream consumers expected. Systems rebuilding a run from event streams could end up with a dangling tool call: the trace said something started, but not that it completed correctly. Anyone who has debugged a production agent from incomplete traces knows how expensive that ambiguity becomes.

The release also introduces shared base classes, ToolCallEvent and ToolResultEvent, which let telemetry consumers handle tool-like activity generically while still distinguishing user tools from structured-output tools. That is the correct design pressure. Observability should not force developers to choose between a high-level “all tool events look the same” abstraction and bespoke per-event plumbing. You want common handling for correlation, timing, and display, plus enough semantic detail to answer the audit question: did the model take an external action, or did it produce a typed answer?
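A trace consumer built on that shape might look like the sketch below. The dataclasses are local stand-ins written for this example, not the Pydantic AI event classes, and their field names (tool_call_id, tool_name) are assumptions. The consumer pairs calls with results through the shared base types, then uses the subclass to decide whether a completed call was an external action or a typed answer.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class CallEvent:
    """Stand-in for a shared 'tool call' base event."""
    tool_call_id: str
    tool_name: str


@dataclass
class ResultEvent:
    """Stand-in for a shared 'tool result' base event."""
    tool_call_id: str


class FunctionCall(CallEvent):
    """A user-registered function tool was called."""


class OutputCall(CallEvent):
    """The internal structured-output tool was called."""


def summarize(events: Iterable[object]) -> Dict[str, List[str]]:
    """Pair calls with results and bucket them by semantic kind."""
    open_calls: Dict[str, CallEvent] = {}
    summary: Dict[str, List[str]] = {
        "external_actions": [],
        "typed_outputs": [],
        "dangling": [],
    }
    for event in events:
        if isinstance(event, CallEvent):
            open_calls[event.tool_call_id] = event
        elif isinstance(event, ResultEvent):
            call = open_calls.pop(event.tool_call_id, None)
            if call is None:
                continue  # result without a known call; ignored in this sketch
            key = "typed_outputs" if isinstance(call, OutputCall) else "external_actions"
            summary[key].append(call.tool_name)
    # Any call still open never reported a result.
    summary["dangling"] = [call.tool_name for call in open_calls.values()]
    return summary
```

With a stream that omits the result for an output call, the "dangling" bucket is non-empty, which is the ambiguity the new result events are meant to remove.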

This is where agent frameworks have to grow up. A nice trace viewer is not enough. Event streams become replay inputs, eval artifacts, incident timelines, and compliance records. If the stream cannot represent the difference between internal output machinery and side-effecting function calls, the system may look observable while hiding the only boundary that matters.

Cancellation is where demo agents go to become services

The third change, PR #5341, fixes cancellation cleanup for spawned tasks. Pydantic AI now uses cancel_and_drain so wrapper tasks and parallel tool tasks are cancelled and drained before the original cancellation is re-raised. This is not glamorous, but it is one of the strongest signals that a framework is being used as infrastructure rather than a notebook toy.

Real agent runs spawn work. They stream responses, call parallel tools, keep MCP sessions alive, run wrappers, and often hold onto network resources or temporary files. Cancellation is not an edge case; it is how timeouts, user interrupts, deploy shutdowns, and queue rebalancing show up. If a parent request is cancelled but child tasks keep running, you get leaked work and confusing side effects: tool calls that complete after the user gave up, traces that close before the work ends, and cleanup code that races against the next run.
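The cancel-then-drain pattern can be illustrated with plain asyncio. This is a sketch of the general technique, not the framework's cancel_and_drain: cancel_and_drain_sketch, slow_tool, and run_tools are hypothetical names, and the tools are simulated with sleeps. The parent awaits its child tasks one at a time, so cancelling the parent would otherwise leave the second tool running.

```python
import asyncio
from typing import List


async def cancel_and_drain_sketch(tasks: List[asyncio.Task]) -> None:
    """Cancel every unfinished task, then wait until all of them have exited."""
    for task in tasks:
        if not task.done():
            task.cancel()
    # return_exceptions=True collects each child's outcome instead of raising.
    await asyncio.gather(*tasks, return_exceptions=True)


async def slow_tool(name: str, log: List[str]) -> str:
    """Stand-in for a long-running tool whose cleanup must complete."""
    try:
        await asyncio.sleep(60)
        return name
    finally:
        log.append(f"{name}: cleaned up")


async def run_tools(names: List[str], log: List[str]) -> List[str]:
    tasks = [asyncio.create_task(slow_tool(n, log)) for n in names]
    try:
        return [await task for task in tasks]
    except asyncio.CancelledError:
        # Drain the children first, then re-raise the original cancellation.
        await cancel_and_drain_sketch(tasks)
        raise


async def demo() -> List[str]:
    log: List[str] = []
    parent = asyncio.create_task(run_tools(["search", "fetch"], log))
    await asyncio.sleep(0.01)  # give both tools time to start
    parent.cancel()
    try:
        await parent
    except asyncio.CancelledError:
        log.append("parent: cancelled")
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The property worth testing in your own code is the ordering: every child's cleanup entry is recorded before the parent reports cancellation, so nothing outlives the request that launched it.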

The context from Pydantic AI 1.92 reinforces the pattern. One day earlier, the project added Anthropic task budget support, runtime output_retries, MCP session task isolation, streaming-response cleanup on cancellation, and eval lifecycle teardown guarantees. Pydantic AI is not merely racing to add model adapters. It is tightening the run loop: budgets, retries, cleanup, isolation, and now explicit tool authority.

For teams already using Pydantic AI, the upgrade checklist is straightforward. First, audit any place where prompt instructions or provider-specific extra_body settings were being used to control tool use. Move that into tool_choice so the framework can validate it and make it portable. Second, update trace consumers to understand output-tool events separately from function-tool events. Third, add cancellation tests around long-running tools and parallel execution paths. The framework can drain tasks correctly; your own cleanup hooks still need to behave.

The broader lesson is bigger than Pydantic AI. Agent control should not be a side effect of how a model happens to interpret a prompt. It should be explicit configuration, visible in traces, testable in isolation, and enforced before the model gets a chance to improvise. Pydantic AI 1.93 is a small release number attached to a serious architectural correction. LGTM.

Sources: Pydantic AI v1.93.0 release, PR #3611, PR #5320, PR #5341