Google Managed Agents Turns Agent Framework Plumbing Into a Cloud API

Anatoliy Kolodkin

20 May 2026 • 5 min read

Google’s Managed Agents announcement is not another framework release wearing a cloud logo. It is Google making a more aggressive claim: the agent loop itself belongs in managed infrastructure.

That is a meaningful shift. For the last two years, most teams building agents have assembled the same fragile stack over and over: a model API, a tool-calling loop, a filesystem, a sandbox, retry logic, state, tracing, permission checks, cancellation, and some optimistic glue code that everyone promises to clean up after the demo. Managed Agents in the Gemini API tries to collapse that pile into a single cloud primitive: call an API, get an Antigravity-powered agent, let it reason, browse, execute code, manage files, and work inside an isolated Linux environment.

The announcement lands as part of Google I/O 2026’s broader developer push around Gemini 3.5 Flash, Antigravity 2.0, the Antigravity CLI and SDK, Google AI Studio, Gemini Enterprise Agent Platform, and the newer Interactions API. Google says the first general-purpose managed agent is the Antigravity agent, built on Gemini 3.5 Flash and the same harness as Google Antigravity. In the docs, the preview model appears as antigravity-preview-05-2026, available through the Interactions API.

The product pitch is simple enough to be dangerous: instead of wiring your own agent runtime, provision one from Google.

The runtime is the product now

The important detail is not that Antigravity can call tools. Every serious agent framework can call tools. The important detail is that Google is bundling the harness, environment, and session model into the API surface. Managed-agent interactions can provision a fresh remote environment, reuse an existing environment by ID, or accept a supplied environment configuration. Reused environments preserve files and state. Environments are Ubuntu-based, include Python 3.12 and Node.js 22, and are deleted after seven days of inactivity.
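To make that lifecycle concrete, here is a minimal local model of the three provisioning paths and the seven-day inactivity window. It is a sketch, not Google’s API: `EnvironmentRequest`, `resolve_mode`, and `is_expired` are hypothetical names, the rule that an environment ID wins over a supplied configuration is an assumption, and the real service may measure inactivity differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# From the docs as described above: deletion after seven days of inactivity.
INACTIVITY_LIMIT = timedelta(days=7)


@dataclass
class EnvironmentRequest:
    """Hypothetical request shape; field names are illustrative only."""
    environment_id: Optional[str] = None  # reuse an existing environment
    config: Optional[dict] = None         # supply an environment configuration


def resolve_mode(request: EnvironmentRequest) -> str:
    """Return which of the three provisioning paths a request would take."""
    if request.environment_id is not None:
        return "reuse"        # assumption: an ID takes precedence over config
    if request.config is not None:
        return "configured"
    return "fresh"


def is_expired(last_activity: datetime, now: datetime) -> bool:
    """True once the full inactivity window has elapsed."""
    return now - last_activity >= INACTIVITY_LIMIT


last_used = datetime(2026, 5, 20, tzinfo=timezone.utc)
print(resolve_mode(EnvironmentRequest(environment_id="env-123")))
print(is_expired(last_used, last_used + timedelta(days=8)))
```

A model like this is mostly useful in tests: it lets a team write down what it expects from reuse and expiry before checking those expectations against the real service.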

That sounds like infrastructure because it is infrastructure. A coding or research agent that can run shell commands, write files, browse the web, and maintain session state is no longer “chat with tools.” It is a short-lived worker with agency. Once you see it that way, the competitive map changes. Google is not merely trying to beat LangChain, CrewAI, AutoGen, Mastra, Agno, or homegrown orchestration libraries at Python ergonomics. It is trying to make the operational substrate of agents feel like a cloud service.

That will be attractive to teams that have already discovered the unglamorous cost of agent infrastructure. The demo loop is easy. The production loop is where optimism gets paged at 2 a.m. You need environment lifecycle management, reproducible sandboxes, state inspection, tool policy, spend limits, retries, cancellation, observability, auth boundaries, and a way to prove what the agent actually did. Most product teams do not want to maintain all of that. They want to ship a workflow.

Managed Agents is Google’s answer: stop building the harness; rent the harness.

The fine print is where engineering teams should start

Google’s own documentation gives builders two details that should slow everyone down before the first production integration. The first is cost: a single interaction may consume roughly 100,000 to 3 million tokens, and environment compute is billed separately. That is not a typo-sized range. It is a reminder that agent workloads are bursty, recursive, and hard to forecast from prompt length alone.
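A quick calculation shows why that range is hard to budget around. The sketch below turns the documented token range into a per-interaction cost band. The price used is a placeholder, not Google’s pricing, and `interaction_cost_range` is a hypothetical helper; separately billed environment compute is left out.

```python
def interaction_cost_range(
    price_per_million_tokens: float,
    low_tokens: int = 100_000,     # lower bound quoted above
    high_tokens: int = 3_000_000,  # upper bound quoted above
) -> tuple[float, float]:
    """Return (low, high) token cost for one interaction, excluding compute."""
    low = low_tokens * price_per_million_tokens / 1_000_000
    high = high_tokens * price_per_million_tokens / 1_000_000
    return low, high


# Placeholder price of 1.00 per million tokens, chosen only to keep the math readable.
low, high = interaction_cost_range(price_per_million_tokens=1.00)
print(f"${low:.2f} to ${high:.2f} per interaction, before compute")
```

Whatever the real price turns out to be, the spread between the two bounds is 30x, so a budget built on the average interaction can be badly wrong for any individual one.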

The second detail matters even more: managed-agent environments have unrestricted outbound network access by default unless network rules are configured. For a preview product, that is understandable. For a production workflow touching source code, customer data, credentials, internal docs, or deploy tooling, it is a giant blinking sign that says “policy goes here.”

Managed does not mean governed. It means someone else operates part of the stack. The team using the agent still owns the blast radius.

If an agent can browse, execute code, write files, call tools, and preserve state across runs, then network policy, secrets handling, approval gates, audit logs, and cost controls are not enterprise afterthoughts. They are table stakes. A managed sandbox without outbound restrictions is useful for experimentation and risky for anything sensitive. A reusable environment without clear state controls is convenient until yesterday’s files become today’s accidental context. A token-heavy interaction is fine for a valuable task and absurd for a badly scoped one.
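As an illustration of what “policy goes here” can mean, the following sketch expresses a deny-by-default egress rule in plain Python. It does not use the syntax of Google’s network rules, which this article does not show. The allowlist contents and the `is_allowed` helper are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Example allowlist; the hosts a real workload needs will differ.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Deny by default: permit only https requests to listed hosts or their subdomains."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    # The leading dot prevents look-alike hosts such as "evil-pypi.org" from matching.
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)


for candidate in (
    "https://pypi.org/simple/requests/",
    "https://evil-pypi.org/simple/requests/",
    "http://pypi.org/simple/requests/",
):
    print(candidate, is_allowed(candidate))
```

The point of writing the rule down, even in this toy form, is that it forces the team to list what the agent legitimately needs to reach before the agent is handed anything sensitive.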

This is where many agent announcements lose the plot. They market capability; practitioners need containment. The correct first question is not “Can Antigravity finish this task?” It is “What can Antigravity reach while trying?”

Gemini 3.5 Flash matters, but the harness matters more

Google is also using Managed Agents to push Gemini 3.5 Flash as the engine for agentic workflows and coding. The published benchmarks are strong signals: 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, 83.6% on MCP Atlas, 84.2% on CharXiv Reasoning, and an output speed, in tokens per second, that Google says is four times that of other frontier models. Gemini 3.5 Flash is generally available across Google Antigravity, the Gemini API in AI Studio and Android Studio, Gemini Enterprise Agent Platform, Gemini Enterprise, the Gemini app, and AI Mode in Search.

Those numbers matter, especially Terminal-Bench and MCP Atlas for builders evaluating coding and tool-use work. But models do not ship agent systems by themselves. The same strong model inside a sloppy runtime will still produce expensive ambiguity. The harness decides how tools are exposed, how state is persisted, how failures are surfaced, how intermediate reasoning and actions are observed, and how humans regain control when the agent goes sideways.

That is why the Antigravity branding is everywhere. Google knows the model is not the entire moat. The runtime is where developer trust will be won or lost.

For teams already using local orchestration libraries, the right comparison is not “framework versus Google” in the abstract. It is local orchestration versus managed agent runtime for each workload. A framework gives you portability, deep customization, provider independence, and the ability to run your own sandboxes. A managed agent gives you faster time-to-runtime, fewer moving pieces, and first-party integration with Google’s model and environment stack. The trade is control for compression.

That trade may be correct for many jobs. It is probably wrong for some. The only honest answer comes from measuring the boring surfaces: cancellation behavior, trace quality, environment reuse, file persistence, network restriction, auth integration, secrets isolation, tool approval flows, and cost per successful task.

What builders should actually do

Treat Managed Agents as a preview runtime worth benchmarking, not a magic abstraction you wire directly into production because the keynote made it look clean. Start with contained workloads: repo analysis on disposable copies, research tasks with non-sensitive sources, code generation inside throwaway environments, or workflow prototypes where the output is reviewed before use.

Set network rules before connecting sensitive tools. Measure token consumption per completed task, not per prompt. Test cancellation under bad conditions: long-running shell commands, tool failures, partial outputs, and ambiguous instructions. Verify what happens when an environment is reused, when it expires after inactivity, and when files are left behind. Compare the managed sandbox against self-hosted options like E2B-style or Daytona-style execution if portability and policy control matter to your team.
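Measuring per completed task rather than per prompt is a small change in arithmetic with a large effect on the result. The sketch below shows one way to compute it, assuming you log token usage and a success flag for every run. `Run` and `tokens_per_completed_task` are hypothetical names and the figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    tokens: int       # total tokens the interaction consumed
    succeeded: bool   # whether the task was actually completed


def tokens_per_completed_task(runs: list[Run]) -> float:
    """Total tokens across all runs, failures included, per successful run."""
    completed = sum(1 for run in runs if run.succeeded)
    if completed == 0:
        raise ValueError("no completed tasks; the metric is undefined")
    return sum(run.tokens for run in runs) / completed


# Invented sample data: two successes and one expensive failure.
runs = [Run(400_000, True), Run(1_200_000, False), Run(800_000, True)]
per_run = sum(run.tokens for run in runs) / len(runs)
print(f"average per run: {per_run:,.0f}")
print(f"per completed task: {tokens_per_completed_task(runs):,.0f}")
```

In this sample the per-run average is 800,000 tokens, but each completed task really cost 1,200,000, because failed runs still consume tokens. That gap is the number to compare against a self-hosted alternative.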

Also watch the Interactions API itself. Google’s docs position it as the new standard for Gemini development, optimized for server-side state management and complex multimodal, multi-turn agent workflows. They also say new agentic capabilities and tools will launch exclusively there. That is a product strategy, not just an API migration. If you adopt Managed Agents, you are also adopting Google’s preferred agent surface.

My take: this is one of the more consequential agent-platform moves of the week because it shifts the argument from libraries to runtimes. Google is betting that agent orchestration becomes cloud infrastructure. That can save teams from maintaining brittle loops, but only if they treat the managed agent like a privileged worker with policy requirements — not like a smarter chat completion endpoint.

Sources: Google Managed Agents announcement, Google I/O 2026 developer highlights, Gemini 3.5 announcement, Antigravity agent docs, Interactions API docs
