OpenAI Agents JS 0.11.5 Makes Resumed Runs and Traces Less Private-API Fragile

The most revealing agent-framework releases are rarely the ones with the cleanest diagrams. They are the ones that admit production agents are messy: runs resume in different processes, traces cross callback boundaries, tools disappear between serialization and execution, realtime sessions temporarily disconnect, and integrations reach into private internals because the public API does not expose the thing operators actually need. OpenAI Agents JS v0.11.5 is useful because it moves several of those hacks into supported surface area.

The release is not selling a new abstraction. It is tightening the runtime contract around trace context, run state, handoffs, missing-tool recovery, usage accounting, and realtime WebRTC lifecycle. That sounds like plumbing because it is plumbing. But plumbing is what separates a demo loop from an agent system a team can observe, replay, resume, and debug without spelunking through private async-storage fields.

OpenAI describes Agents JS as a production-ready successor to Swarm, built around a small primitive set: agents, sandbox agents, agents-as-tools and handoffs, guardrails, function tools, MCP tool calling, sessions, human-in-the-loop, tracing, and realtime agents. That list is broad enough to be dangerous if lifecycle boundaries are fuzzy. v0.11.5 is therefore less about expanding what an agent can do and more about making the existing runtime less fragile when embedded in real applications.

Trace context is an API, not an implementation detail

The trace changes are the center of the release. PR #1344 adds supported tracing ID generation so deterministic runtimes can provide replay-safe trace and span IDs through TraceProvider.generateTraceId(), generateSpanId(), and setTracingIdGenerator(). Explicit IDs take priority, and the default generator is frozen. That is the right shape: observability systems often need stable IDs for replay, test determinism, cross-process correlation, or compliance capture, but the SDK should not require users to monkey-patch internals to get them.
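To make the replay-safe idea concrete, here is a minimal sketch of a deterministic ID generator with an explicit-ID override. It does not call the SDK: the IdGenerator type, createDeterministicIdGenerator, resolveTraceId, and the trace_/span_ ID shapes are assumptions made for illustration, and the hash is a simple non-cryptographic FNV-1a, chosen only because it is easy to reproduce.

```typescript
// Hypothetical sketch: not the SDK's generator. The ID shapes are assumptions.
type IdGenerator = {
  generateTraceId(): string;
  generateSpanId(): string;
};

// FNV-1a (32-bit), repeated with a round salt until enough hex digits exist.
function fnv1aHex(input: string, hexLength: number): string {
  let out = "";
  for (let round = 0; out.length < hexLength; round += 1) {
    let hash = 0x811c9dc5;
    const text = `${round}:${input}`;
    for (let i = 0; i < text.length; i += 1) {
      hash ^= text.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    out += hash.toString(16).padStart(8, "0");
  }
  return out.slice(0, hexLength);
}

// Derive IDs from a seed plus a counter, so replaying with the same seed
// yields the same sequence of IDs.
function createDeterministicIdGenerator(seed: string): IdGenerator {
  let counter = 0;
  const next = (prefix: string, hexLength: number): string => {
    counter += 1;
    return prefix + fnv1aHex(`${seed}:${prefix}:${counter}`, hexLength);
  };
  return {
    generateTraceId: () => next("trace_", 32),
    generateSpanId: () => next("span_", 24),
  };
}

// Explicit IDs take priority; the generator is only the fallback.
function resolveTraceId(explicit: string | undefined, gen: IdGenerator): string {
  return explicit ?? gen.generateTraceId();
}
```

In a real integration, a generator like this would presumably be registered through the supported hook the release describes rather than called directly; the exact registration signature is not shown here because the source does not spell it out.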

PR #1347 adds getCurrentTraceContext() and withTraceContext(...), giving integrations a public way to capture, restore, overlay, or clear trace/span context around callbacks. This is one of those APIs that looks minor until you have a background job queue, an HTTP request boundary, a UI event, and a streaming callback all trying to keep attribution intact. Without public helpers, teams copy private async-storage shapes and then get surprised when an SDK upgrade breaks their observability.
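The capture-then-restore pattern is easier to see in miniature. The sketch below is a synchronous stand-in, not the SDK's implementation: TraceContext, getCurrentContext, withContext, and bindToCurrentContext are hypothetical names, and a real ambient context would need async-aware storage to survive awaits. It only illustrates the contract: capture the context where work is scheduled, restore it where the work runs, and put the previous context back afterward.

```typescript
// Hypothetical stand-in for ambient trace context. The SDK's real helpers
// may differ in shape and semantics; this version is synchronous only.
type TraceContext = { traceId: string; spanId?: string };

let current: TraceContext | undefined;

function getCurrentContext(): TraceContext | undefined {
  return current;
}

// Run fn under the given context (or under none when ctx is undefined),
// then restore whatever context was active before the call.
function withContext<T>(ctx: TraceContext | undefined, fn: () => T): T {
  const previous = current;
  current = ctx;
  try {
    return fn();
  } finally {
    current = previous;
  }
}

// Capture the context at scheduling time; restore it at execution time.
function bindToCurrentContext<T>(fn: () => T): () => T {
  const captured = getCurrentContext();
  return () => withContext(captured, fn);
}
```

A queue callback bound inside a request's context keeps reporting under that request's trace even if a worker later runs it under a different ambient context, which is the attribution guarantee integrations otherwise rebuild by hand.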

PR #1356 adds RunState.clearTrace(), which addresses a subtle but common resumed-run problem. Serialized run state may contain an old trace identity. The caller resuming that state may want the resumed work attributed to a new request, job, user action, or trace root. Clearing restored trace data lets Runner.run bind the resumed state to the current ambient trace instead of poisoning the new run with stale identity. That is not a cosmetic concern; misattributed traces turn incident review into archaeology.
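A toy model shows why clearing matters. The shapes below (SavedRunState, TraceRef) and the resolveTrace rule are assumptions for illustration and do not reproduce the SDK's RunState format; the point is only that a restored trace identity wins unless the caller removes it first.

```typescript
// Hypothetical shapes; the SDK's serialized RunState is not modeled here.
type TraceRef = { traceId: string };
type SavedRunState = { turn: number; trace?: TraceRef };

// Return a copy without the restored trace identity, leaving the input intact.
function clearTrace(state: SavedRunState): SavedRunState {
  const copy: SavedRunState = { ...state };
  delete copy.trace;
  return copy;
}

// Assumed binding rule: a trace carried by the state takes precedence,
// otherwise the resumed run adopts the caller's ambient trace.
function resolveTrace(state: SavedRunState, ambient: TraceRef): TraceRef {
  return state.trace ?? ambient;
}

// Simulate a state that was serialized in one process and restored in another.
const restored: SavedRunState = JSON.parse(
  JSON.stringify({ turn: 3, trace: { traceId: "trace_old" } }),
);
const ambient: TraceRef = { traceId: "trace_new" };

const staleAttribution = resolveTrace(restored, ambient).traceId;
const freshAttribution = resolveTrace(clearTrace(restored), ambient).traceId;
```

Here staleAttribution ends up as the old trace and freshAttribution as the new one. A regression test for resumed runs can assert exactly that difference.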

The related lifecycle dispatch helpers from PR #1355 let tracing processors and providers replay already-completed trace and span objects while preserving timestamps and lifecycle state. That matters for integrations that buffer events, bridge traces between processes, or export after completion. A trace system that only works in one synchronous call stack is not enough for durable agents.
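As a rough illustration of replay, the sketch below forwards already-completed spans to processors without re-stamping them. CompletedSpan, SpanProcessor, and replayCompletedSpans are hypothetical names, not the helpers from the PR; the assumption being illustrated is simply that replay dispatches the original objects, so start and end times survive the trip.

```typescript
// Hypothetical shapes for illustration; not the SDK's span or processor types.
type CompletedSpan = {
  readonly spanId: string;
  readonly startedAt: string; // ISO timestamp recorded when the span began
  readonly endedAt: string; // ISO timestamp recorded when the span finished
};

type SpanProcessor = {
  onSpanStart(span: CompletedSpan): void;
  onSpanEnd(span: CompletedSpan): void;
};

// Dispatch start and end for each buffered span, in order, passing the
// original objects through so no timestamp is regenerated during replay.
function replayCompletedSpans(
  spans: readonly CompletedSpan[],
  processors: readonly SpanProcessor[],
): void {
  for (const span of spans) {
    for (const processor of processors) {
      processor.onSpanStart(span);
      processor.onSpanEnd(span);
    }
  }
}
```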

Missing-tool behavior is another useful maturity signal. PR #1336 adds toolNotFoundBehavior to RunConfig. The default remains raise_error, which is correct. Unknown tools should fail loudly unless the application deliberately chooses otherwise. But callers can now opt into return_error_to_model, which records the missing function call, emits a model-visible tool error, carries missing-tool state through RunState serialization, and lets the agent continue in streaming and non-streaming paths.
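The two behaviors can be sketched as a small dispatcher. Only the option values raise_error and return_error_to_model come from the release; ToolCall, ToolResult, ToolRegistry, and dispatchToolCall are hypothetical names, and the error message wording is invented for the example.

```typescript
type ToolNotFoundBehavior = "raise_error" | "return_error_to_model";

// Hypothetical shapes for illustration; not the SDK's internal types.
type ToolCall = { callId: string; name: string; args: unknown };
type ToolResult = { callId: string; output: string; isError: boolean };
type ToolRegistry = Map<string, (args: unknown) => string>;

function dispatchToolCall(
  call: ToolCall,
  registry: ToolRegistry,
  behavior: ToolNotFoundBehavior = "raise_error",
): ToolResult {
  const tool = registry.get(call.name);
  if (tool !== undefined) {
    return { callId: call.callId, output: tool(call.args), isError: false };
  }
  if (behavior === "raise_error") {
    // Default: an unknown tool is a hard failure for the whole run.
    throw new Error(`Tool not found: ${call.name}`);
  }
  // Opt-in: hand the model a structured error so the run can continue.
  return {
    callId: call.callId,
    output: `Tool "${call.name}" is not available in this run.`,
    isError: true,
  };
}
```

Keeping the call ID on the error result is the detail that matters for auditing: the trace can still pair the model's request with the failure it received.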

That option is easy to abuse. It is not a license to let tool registries drift casually. But it is valuable in long-running systems where tool availability can legitimately change between state capture, restore, routing, and execution. A durable agent may resume after a deploy, move across tenants, or run in an environment with a narrower tool set. Returning a structured error to the model can be a better operational choice than dropping the whole run, provided the trace records exactly what happened and policy decides where recovery is allowed.

Handoffs get a similar cleanup. PR #1348 adds a public Handoff#clone(...) API that preserves the model-facing tool contract (name, description, input schema, strictness, filters, enabled predicate, and callback behavior) while allowing runtime target, callback, metadata, schema, and availability overrides. That distinction matters. Handoffs are not just JavaScript objects. They are part of the contract the model sees. Reconstructing them by hand risks changing the prompt-visible interface while trying to change only runtime routing.
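The split between model-facing contract and runtime routing can be expressed in types. The sketch below is a deliberately narrowed model, not the SDK's Handoff class: HandoffSketch, RuntimeOverrides, and cloneHandoff are hypothetical, and overrides are limited to target and metadata so the compiler rejects any attempt to change what the model sees.

```typescript
// Hypothetical model of a handoff; field names are assumptions.
type HandoffSketch = {
  // Model-facing contract: what the model sees as a callable tool.
  readonly toolName: string;
  readonly toolDescription: string;
  readonly inputSchema: Readonly<Record<string, unknown>>;
  readonly strict: boolean;
  // Runtime-only routing details.
  readonly targetAgent: string;
  readonly metadata: Readonly<Record<string, string>>;
};

// Only runtime fields may be overridden in this sketch.
type RuntimeOverrides = Partial<Pick<HandoffSketch, "targetAgent" | "metadata">>;

function cloneHandoff(
  base: HandoffSketch,
  overrides: RuntimeOverrides = {},
): HandoffSketch {
  return {
    ...base,
    targetAgent: overrides.targetAgent ?? base.targetAgent,
    metadata: { ...base.metadata, ...(overrides.metadata ?? {}) },
  };
}

const billing: HandoffSketch = {
  toolName: "transfer_to_billing",
  toolDescription: "Route the conversation to the billing specialist.",
  inputSchema: { type: "object", properties: {} },
  strict: true,
  targetAgent: "billing-v1",
  metadata: { region: "us" },
};

// Reroute to a different runtime target without touching the tool contract.
const rerouted = cloneHandoff(billing, {
  targetAgent: "billing-v2",
  metadata: { tenant: "acme" },
});
```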

The realtime fix is small but practical. PR #1363 changes WebRTC disconnect handling so RTCPeerConnection.connectionState === "disconnected" does not immediately close the transport. The SDK now waits through a short grace period, cancels closure if the peer returns to connected, and still closes immediately on failed or closed. That maps better to browser and network reality. Voice agents do not run on laboratory networks; a transient disconnect should not always be terminal.
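The described behavior amounts to a small state machine. This is a sketch under stated assumptions rather than the SDK's transport code: createDisconnectGrace, TimerLike, and the graceMs parameter are hypothetical, the release does not state the actual grace duration, and the timer is injected so the logic can be tested without waiting on a real clock.

```typescript
// The connection states defined for RTCPeerConnection.connectionState.
type PeerState =
  | "new"
  | "connecting"
  | "connected"
  | "disconnected"
  | "failed"
  | "closed";

// Injected timer so tests can drive time by hand; production code could
// wrap setTimeout and clearTimeout behind this shape.
type TimerLike = {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
};

function createDisconnectGrace(
  closeTransport: () => void,
  graceMs: number,
  timer: TimerLike,
): (state: PeerState) => void {
  let pending: { handle: unknown } | null = null;
  let closed = false;

  const cancelPending = (): void => {
    if (pending !== null) {
      timer.clear(pending.handle);
      pending = null;
    }
  };

  // Close at most once, whichever path gets here first.
  const closeOnce = (): void => {
    cancelPending();
    if (!closed) {
      closed = true;
      closeTransport();
    }
  };

  return (state: PeerState): void => {
    if (closed) return;
    if (state === "failed" || state === "closed") {
      closeOnce(); // terminal states close immediately
    } else if (state === "disconnected") {
      if (pending === null) {
        // Transient state: schedule closure instead of closing now.
        pending = { handle: timer.set(closeOnce, graceMs) };
      }
    } else if (state === "connected") {
      cancelPending(); // the peer recovered within the grace period
    }
  };
}
```

In a browser, the returned function would be fed from a connectionstatechange listener on the peer connection. A transient-disconnect test then reduces to three cases: recover before the timer fires, let the timer fire, and fail outright.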

The usage-accounting fix matters for teams watching cost and quality signals. PR #1342 preserves output token details such as reasoning and text token counts in response usage and generation spans when integrating through the AI SDK path. Aggregate output tokens are not enough if an observability stack uses reasoning-token counts to detect cost regressions, compare model behavior, or explain why a workflow suddenly became expensive.
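The difference between an aggregate and a breakdown shows up as soon as usage is summed across responses. The sketch below uses hypothetical field names (inputTokens, outputTokens, outputTokensDetails, and the reasoning/text keys) rather than the SDK's usage type, and the numbers are made up; it shows an accumulator that keeps per-category output counts instead of collapsing them.

```typescript
// Hypothetical usage shape; field names are assumptions for illustration.
type Usage = {
  inputTokens: number;
  outputTokens: number;
  outputTokensDetails: Record<string, number>;
};

// Sum two usage records while keeping the per-category output breakdown.
function addUsage(a: Usage, b: Usage): Usage {
  const details: Record<string, number> = { ...a.outputTokensDetails };
  for (const [key, value] of Object.entries(b.outputTokensDetails)) {
    details[key] = (details[key] ?? 0) + value;
  }
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
    outputTokensDetails: details,
  };
}

const firstTurn: Usage = {
  inputTokens: 500,
  outputTokens: 200,
  outputTokensDetails: { reasoning: 120, text: 80 },
};
const secondTurn: Usage = {
  inputTokens: 700,
  outputTokens: 50,
  outputTokensDetails: { reasoning: 30, text: 20 },
};

const total = addUsage(firstTurn, secondTurn);
// total.outputTokens is 250, with reasoning at 150 and text at 100.
```

With only the aggregate, both turns look like ordinary output. With the breakdown, a dashboard can see that 150 of the 250 output tokens were reasoning.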

For practitioners, the recommendation is straightforward. Upgrade if you are building on Agents JS and especially if you persist RunState, bridge traces across workers, use handoffs as routing primitives, connect through Vercel AI SDK surfaces, or ship realtime agents. Replace any private trace-context access with getCurrentTraceContext() and withTraceContext(...). Decide explicitly whether missing tools should be fatal or model-visible recoverable errors. Add tests that resume a serialized run under a fresh trace and verify attribution. Add a WebRTC transient-disconnect test if realtime is part of your product surface.

The editorial read is that agent SDK maturity is moving away from prettier loops and toward operational affordances. The hard question is no longer “can the agent call a tool?” It is “can we resume the run, preserve the audit trail, recover from drift, keep the UI alive through normal network failure, and explain what happened afterward?” v0.11.5 answers part of that question by making fewer integrations depend on private contracts. Good. Private contracts are where production systems go to quietly rot.

Sources: OpenAI Agents JS v0.11.5 release, OpenAI Agents JS docs, PR #1336, PR #1344, PR #1347, PR #1356, PR #1363.