Mastra 1.41 Turns Sandboxes Into a Multi-Tenant Runtime Decision

Anatoliy Kolodkin

06 Jun 2026 • 6 min read

Mastra 1.41 is the kind of release that looks boring until you have actually tried to ship an agent product with more than one customer. Then it looks less like plumbing and more like the boundary between a demo and an incident report.

The headline change in @mastra/core 1.41.0 is per-request Workspace sandbox resolution. In plain English: a single Mastra Workspace can now decide, at request time, which sandbox should run commands for this user, tenant, role, or thread. The old shape was static: one Workspace, one sandbox. The new shape is WorkspaceSandbox | (({ requestContext }) => WorkspaceSandbox), which means sandbox selection can follow application identity instead of framework construction.

That matters because the serious question for command-running agents is no longer “can the agent execute shell commands?” We crossed that bridge, burned it, rebuilt it as an MCP server, and added a chat UI. The real question is whose environment those commands run in, which files are visible, how long background processes survive, whether prompts leak tenant-specific paths, and who cleans up after the sandbox when the conversation is over.

A sandbox is identity wearing a filesystem costume

Mastra’s docs define Workspaces as persistent agent environments for storing files and executing commands. They can expose filesystem tools like read, write, list, delete, copy, move, and grep; sandbox tools for shell commands and background processes; language-server inspection; search over indexed content; and reusable skills. That is a lot of power to hang off one configuration object. It is also exactly why per-request sandbox routing is not a niche feature.

In local development, pairing a LocalFilesystem with a LocalSandbox pointed at the same directory is convenient. In a multi-tenant SaaS product, it is a footgun if you do not make identity part of the runtime boundary. Mastra’s example is intentionally simple: pull user-id out of requestContext and return a LocalSandbox rooted at /workspaces/${userId}. The docs also show the resolver can be asynchronous, so an application can look up tenant configuration from a database before returning the sandbox.
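The resolver shape described above can be sketched without importing Mastra at all. Everything in this block is a stand-in written for illustration: SketchSandbox, SketchRequestContext, and resolveSandbox are hypothetical names, and the Map-based context is an assumption about how identity reaches the resolver. The ID check is an addition of this sketch, included because a user ID that becomes a path segment should not be trusted blindly.

```typescript
// Stand-in types for illustration only; the real Mastra classes are not imported here.
interface SketchSandbox {
  readonly workingDirectory: string;
}

// Assumption: the request context behaves like a string-keyed map filled in by auth middleware.
type SketchRequestContext = Map<string, string>;

type SandboxResolver = (args: { requestContext: SketchRequestContext }) => SketchSandbox;
type SandboxOption = SketchSandbox | SandboxResolver;

// Only accept IDs that are safe to use as a single path segment.
const SAFE_ID = /^[A-Za-z0-9_-]{1,64}$/;

const perUserSandbox: SandboxResolver = ({ requestContext }) => {
  const userId = requestContext.get("user-id");
  if (userId === undefined || !SAFE_ID.test(userId)) {
    // Fail closed: never fall back to a shared directory.
    throw new Error("missing or unsafe user-id in request context");
  }
  return { workingDirectory: `/workspaces/${userId}` };
};

// Accept either shape: a static sandbox or a per-request resolver.
function resolveSandbox(
  option: SandboxOption,
  requestContext: SketchRequestContext,
): SketchSandbox {
  return typeof option === "function" ? option({ requestContext }) : option;
}

const aliceSandbox = resolveSandbox(perUserSandbox, new Map([["user-id", "alice"]]));
console.log(aliceSandbox.workingDirectory); // "/workspaces/alice"
```

The point of the union type is that calling code does not care which shape it received. The point of failing closed is that a request without identity should get no sandbox rather than somebody else's.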

That is the right abstraction. Tenant isolation is not a static import. It belongs at the point where a request arrives with identity, authorization, plan limits, region, data residency requirements, and maybe a workspace lease that already exists. If the framework forces teams to create one agent instance per tenant just to get isolation, teams will either overcomplicate the product or quietly share too much state. Neither outcome belongs in production.

The catch is that Mastra correctly pushes lifecycle ownership back to the caller. With a static sandbox, workspace.init() can call start(), and workspace.destroy() can call destroy(). With a resolver, the Workspace does not have a concrete instance at construction time. The resolver must return something ready to use, and the application owns cleanup. That cleanup might be per request, per tenant, per user, per session, or attached to a sandbox pool. This is not Mastra being lazy. It is Mastra refusing to pretend it can manage infrastructure it did not create.
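For the per-request flavor of that cleanup, one possible pattern is a wrapper that starts the sandbox, runs the work, and destroys the sandbox whether or not the work throws. This is a minimal sketch, not Mastra's API: EphemeralSandbox and withRequestSandbox are hypothetical names, and the lifecycle is synchronous here for brevity where a real sandbox would almost certainly start and stop asynchronously.

```typescript
// Hypothetical stand-in for a sandbox with an explicit lifecycle.
class EphemeralSandbox {
  state: "new" | "running" | "destroyed" = "new";

  start(): void {
    this.state = "running";
  }

  destroy(): void {
    this.state = "destroyed";
  }
}

// The application, not the framework, owns start and destroy.
function withRequestSandbox<T>(work: (sandbox: EphemeralSandbox) => T): T {
  const sandbox = new EphemeralSandbox();
  sandbox.start(); // must be ready before anything uses it
  try {
    return work(sandbox);
  } finally {
    sandbox.destroy(); // runs on success and on throw
  }
}

const stateDuringWork = withRequestSandbox((sandbox) => sandbox.state);
console.log(stateDuringWork); // "running"
```

Per-tenant or per-session ownership would replace the try/finally with a lease that outlives the request, which is where an explicit registry and eviction policy come in.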

The prompt-caching detail is the quiet smart part

The release includes one design choice that deserves more attention than the API signature: resolver-backed sandboxes contribute stable placeholder text to the agent’s system message by default. Mastra does not call the sandbox resolver merely to build workspace instructions. If teams want concrete per-request sandbox details in the prompt, they can opt in with instructions.dynamicSandbox: 'resolve' or provide a function that derives text from requestContext.

That sounds minor. It is not. If building a prompt provisions a caller-owned sandbox, prompt construction becomes a side-effecting infrastructure operation. Worse, every tenant-specific path, sandbox ID, or runtime detail can make prompts less cacheable and more likely to leak operational shape into the model context. Stable placeholder instructions keep the system message consistent and cache-friendly, while still letting teams expose concrete context when it is genuinely useful.
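The behavior is easier to see in a toy model. The sketch below is not Mastra's implementation: sandboxInstructions, PromptSandbox, and the placeholder wording are invented for illustration, and only the 'resolve' value and the function form come from the release description. It shows the property that matters, which is that default prompt construction never touches the resolver.

```typescript
// Stand-in types; none of these names come from Mastra.
interface PromptSandbox {
  readonly workingDirectory: string;
}
type PromptContext = Map<string, string>;

// Per the release notes: 'resolve' or a function deriving text from request context.
type DynamicSandboxMode = "resolve" | ((ctx: PromptContext) => string);

let resolverCalls = 0;

// A resolver that counts how often it is invoked.
function countingResolver(ctx: PromptContext): PromptSandbox {
  resolverCalls += 1;
  return { workingDirectory: `/workspaces/${ctx.get("user-id") ?? "unknown"}` };
}

// Hypothetical instruction builder. With no mode set it returns fixed text
// and does not call the resolver, so the system message stays cacheable.
function sandboxInstructions(
  mode: DynamicSandboxMode | undefined,
  ctx: PromptContext,
  resolver: (ctx: PromptContext) => PromptSandbox,
): string {
  if (mode === undefined) {
    return "Commands run in a sandbox selected for the current request.";
  }
  if (mode === "resolve") {
    return `Commands run in ${resolver(ctx).workingDirectory}.`;
  }
  return mode(ctx);
}

const ctxAlice: PromptContext = new Map([["user-id", "alice"]]);
const ctxBob: PromptContext = new Map([["user-id", "bob"]]);

const defaultForAlice = sandboxInstructions(undefined, ctxAlice, countingResolver);
const defaultForBob = sandboxInstructions(undefined, ctxBob, countingResolver);
console.log(defaultForAlice === defaultForBob, resolverCalls); // true 0
```

Opting in with "resolve" would produce different text per user and increment the counter, which is exactly the trade the paragraph above describes: concrete detail in exchange for cacheability and a side effect.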

This is the agent-framework version of a mature API decision: do not make the expensive, stateful thing happen as an incidental side effect of preparing a request. Agent systems already have enough hidden state. The framework should not add more just because the model likes knowing where it is standing.

There are useful constraints, too. Mastra says sandbox resolvers are incompatible with mounts (combining them throws INVALID_CONFIG) and that lsp: true is disabled with a warning. Both features need a concrete sandbox at construction time. That limitation is annoying in the way correct limitations often are. Dynamic isolation, cloud filesystem mounts, and language-server introspection do not compose for free, and pretending otherwise would leave builders debugging invisible environment assumptions three weeks later.

Background processes are where abstraction leaks first

The second meaningful piece is sandboxCacheKey. Mastra sandboxes can run background processes through execute_command with background: true, then inspect or stop those processes later with get_process_output and kill_process. That only works if the later request reaches the same sandbox that started the process.

With dynamic sandboxes, “same sandbox” is not automatic. A later conversation turn may have a different RequestContext instance even though it belongs to the same user, tenant, or thread. Mastra’s answer is a stable cache key: for example, derive sandboxCacheKey from thread-id. Failed resolver calls are removed from cache so later calls can retry, and workspace.clearSandboxCache(cacheKey) lets lifecycle code drop references after externally destroying or replacing the sandbox.
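The caching semantics can be modeled in a few lines. This is a simplified sketch under stated assumptions, not Mastra's code: SandboxCacheSketch, its resolve and clear methods, and the Map-based context are hypothetical, and the resolver is synchronous here. In this version a failing resolver simply stores nothing, which gives the same retry behavior the release describes.

```typescript
// Stand-in types; names are invented for this sketch.
interface CachedSandbox {
  readonly id: string;
}
type ThreadContext = Map<string, string>;

class SandboxCacheSketch {
  private readonly entries = new Map<string, CachedSandbox>();
  private readonly resolver: (ctx: ThreadContext) => CachedSandbox;
  private readonly cacheKey: (ctx: ThreadContext) => string;

  constructor(
    resolver: (ctx: ThreadContext) => CachedSandbox,
    cacheKey: (ctx: ThreadContext) => string,
  ) {
    this.resolver = resolver;
    this.cacheKey = cacheKey;
  }

  resolve(ctx: ThreadContext): CachedSandbox {
    const key = this.cacheKey(ctx);
    const hit = this.entries.get(key);
    if (hit !== undefined) return hit;
    // If the resolver throws, nothing is stored, so the next call retries.
    const sandbox = this.resolver(ctx);
    this.entries.set(key, sandbox);
    return sandbox;
  }

  // Drop a reference after the sandbox was destroyed or replaced elsewhere.
  clear(key: string): boolean {
    return this.entries.delete(key);
  }
}

// Key on the thread, not on the context object, which differs per request.
const threadKey = (ctx: ThreadContext): string => {
  const threadId = ctx.get("thread-id");
  if (threadId === undefined) throw new Error("thread-id is required for sandbox caching");
  return threadId;
};

let sandboxesCreated = 0;
const sandboxCache = new SandboxCacheSketch(
  () => ({ id: `sbx-${++sandboxesCreated}` }),
  threadKey,
);

// Two turns, two distinct context objects, one sandbox.
const turnOne = sandboxCache.resolve(new Map([["thread-id", "t-1"]]));
const turnTwo = sandboxCache.resolve(new Map([["thread-id", "t-1"]]));
console.log(turnOne === turnTwo); // true
```

Without the key, the second turn would get a fresh sandbox and get_process_output would have nothing to inspect.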

This is exactly the kind of feature that will not trend on Hacker News and will absolutely save somebody’s production week. Background processes are the first place agent abstractions leak because they turn a stateless chat request into an operating-system lifecycle problem. Dev servers, test watchers, long-running crawlers, queued jobs, and language servers do not care that your product thinks in turns. “Hope the resolver returns the same thing” is not a model.

Practitioners should treat sandboxCacheKey as a design decision, not a convenience flag. Key it too broadly and tenants may share process state. Key it too narrowly and follow-up commands cannot find their processes. Key it to a durable thread or workspace lease when continuity matters, and make cleanup explicit. Also put quotas around it. Persistent sandboxes are useful right up until every abandoned chat owns a sleeping process and a slice of your cloud bill.
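Two of those recommendations are small enough to sketch: scoping the key and capping background work. Both helpers below are hypothetical and live entirely in application code; scopedSandboxKey and BackgroundProcessQuota are names invented here, and the limit of two is arbitrary.

```typescript
// Hypothetical helper: include the tenant in the key so two tenants can
// never share a sandbox, even if their thread IDs happen to match.
function scopedSandboxKey(tenantId: string, threadId: string): string {
  // JSON encoding avoids ambiguity such as ("a:b", "c") versus ("a", "b:c").
  return JSON.stringify([tenantId, threadId]);
}

// Hypothetical quota: cap concurrent background processes per sandbox key.
class BackgroundProcessQuota {
  private readonly running = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Reserves a slot and returns true, or returns false when the key is at its limit.
  tryStart(key: string): boolean {
    const current = this.running.get(key) ?? 0;
    if (current >= this.limit) return false;
    this.running.set(key, current + 1);
    return true;
  }

  // Call when a process exits or is killed.
  finish(key: string): void {
    const current = this.running.get(key) ?? 0;
    if (current <= 1) {
      this.running.delete(key);
    } else {
      this.running.set(key, current - 1);
    }
  }
}

const processQuota = new BackgroundProcessQuota(2);
const keyA = scopedSandboxKey("tenant-a", "thread-1");
const startedFirst = processQuota.tryStart(keyA); // true
const startedSecond = processQuota.tryStart(keyA); // true
const startedThird = processQuota.tryStart(keyA); // false, limit reached
```

A real quota would also need a time-based sweep, since a process that is never reported as finished holds its slot forever.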

`untilIdle` admits streams are no longer token pipes

Mastra 1.41 also unifies “stream until idle” behavior behind an untilIdle option across core, server, and client SDKs. stream() and resumeStream() now accept untilIdle: true or { maxIdleMs }; server endpoints and client SDKs accept the same field. The older dedicated streamUntilIdle(), resumeStreamUntilIdle(), and matching endpoints remain available but are deprecated.

This is the right API cleanup because agent streams are no longer just token pipes. They carry tool events, background continuations, resumptions, UI state, and sometimes work that keeps going after the first response looks done. A single stream(..., { untilIdle: true }) shape is less error-prone than parallel endpoint families with almost identical semantics. It also forces the product team to acknowledge the operational question: how long should this connection stay open while the agent is still doing useful work?

The answer should not be “forever.” Teams using untilIdle need idle caps, cancellation paths, traceability, and resource accounting. Keeping a stream open across background continuations improves UX when the user expects live progress. It becomes a denial-of-wallet feature if every speculative task keeps sockets, workers, or sandboxes warm without limits.
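An idle cap is a one-function policy. The sketch below assumes only the option shape the release describes, true or { maxIdleMs }. The default, the ceiling, and the helper names effectiveIdleMs and idleExpired are assumptions of this example; what Mastra itself does with a bare true is not something this sketch claims to know.

```typescript
// Option shape as described in the release: true or { maxIdleMs }.
type UntilIdleOption = true | { maxIdleMs: number };

// Assumed application policy, not Mastra defaults.
const DEFAULT_IDLE_MS = 30_000;
const MAX_IDLE_MS = 120_000;

// Returns the idle window to enforce, or null when the caller did not ask
// to wait for idle at all.
function effectiveIdleMs(option: UntilIdleOption | undefined): number | null {
  if (option === undefined) return null;
  if (option === true) return DEFAULT_IDLE_MS;
  const requested = option.maxIdleMs;
  // Reject NaN, Infinity, zero, and negatives rather than trusting the client.
  if (!Number.isFinite(requested) || requested <= 0) return DEFAULT_IDLE_MS;
  return Math.min(requested, MAX_IDLE_MS);
}

// True once no activity has been seen for at least the idle window.
function idleExpired(lastActivityAtMs: number, nowMs: number, idleMs: number): boolean {
  return nowMs - lastActivityAtMs >= idleMs;
}

console.log(effectiveIdleMs({ maxIdleMs: 10_000_000 })); // 120000
```

Clamping on the server matters because the client SDK accepts the same field. A cap the client can raise is a suggestion, not a cap.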

The broader pattern is clear: Mastra is moving the framework discussion away from “agents versus workflows” and toward runtime contracts. File access, command execution, sandbox identity, process continuity, tool approvals, and cleanup have to be explicit surfaces.

For engineering teams evaluating Mastra against LangGraph, Pydantic AI, CrewAI, Microsoft Agent Framework, or ADK, this release is a useful filter. If your agent never runs commands, never touches tenant data, and never keeps background work alive, dynamic sandboxes may not matter yet. If you are building coding agents, data-analysis agents, internal automation agents, or customer-facing agents that can execute tools in user-specific environments, they matter immediately.

The practical checklist is straightforward. Route sandboxes from authenticated request context, not from model-visible text. Make lifecycle ownership explicit before launch: who starts sandboxes, who destroys them, and when cached references are cleared. Use stable cache keys for background processes, but scope them narrowly enough to avoid cross-tenant leakage. Keep prompt instructions stable unless per-request environment details are necessary. Require approval for side-effecting tools, especially command execution and writes. Put quotas on background processes and open streams. Log sandbox IDs, cache keys, tool calls, process lifecycle events, and cleanup outcomes so “the agent did something weird” becomes an investigation, not folklore.

Mastra 1.41 is not flashy. Flashy is how agent frameworks got stuck optimizing demos while production users were asking where Bob’s files went. This release is about the less glamorous truth: once agents can run commands, the sandbox is not an implementation detail. It is the runtime trust boundary. Treat it that way, or your architecture review will do it for you.

Sources: Mastra @mastra/core 1.41.0 release, Mastra Workspace docs, Mastra Sandbox docs, Mastra Workflows docs
