Google Cloud Next 2026: The Infrastructure Behind the Agentic Era Takes Shape

Anatoliy Kolodkin

05 May 2026 • 5 min read

SiliconANGLE's theCUBE analysts spent three days at Google Cloud Next 2026 covering infrastructure announcements, and if you filter out the vendor keynotes and analyst enthusiasm, what's left is a concrete picture of how the hyperscalers are positioning themselves for an agentic computing era. The through-line from their recap: agentic AI is not primarily a model problem. It's an infrastructure problem, and the companies that own the control plane are the ones that will capture the value.

That's a perspective worth taking seriously, because the developer press — LGTM included — spends most of its time on model comparisons and benchmark wars. The infrastructure story is harder to cover and less viral, but it's what determines how fast agentic coding spreads in enterprise environments and what the deployment context looks like when it does.

Kubernetes as the operating system nobody announced

Google's framing at Next was that Kubernetes has become the de facto OS for AI — from training to inference to reinforcement learning. That's not a product announcement; it's an architectural observation. Drew Bradstock from Google's GKE team put it plainly: "We're finding ourselves having to adapt Kubernetes quite quickly, even faster than the open-source community can keep up." The implication for platform engineers is immediate: if Kubernetes is the OS, then the skills that matter are not just "can the agent write code" but "can the agent interact correctly with a Kubernetes API." That changes how agentic coding tools should be evaluated, and how CI/CD pipelines for AI-generated code need to be designed.
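One way to make "can the agent interact correctly with a Kubernetes API" testable in a pipeline is a policy gate over agent-generated manifests. The following is a minimal sketch, assuming the manifest has already been parsed into a Python dict. The function name `check_pod_manifest` and the specific rules are illustrative assumptions, not anything Google described at Next; the field names it inspects (`resources.limits`, `securityContext.allowPrivilegeEscalation`, `securityContext.runAsNonRoot`) are standard Pod spec fields.

```python
from typing import Any


def check_pod_manifest(manifest: dict[str, Any]) -> list[str]:
    """Return policy violations for a parsed Pod manifest.

    Hypothetical CI gate: the rules are illustrative, not a standard.
    """
    if manifest.get("kind") != "Pod":
        return [f"expected kind Pod, got {manifest.get('kind')!r}"]

    problems: list[str] = []
    containers = manifest.get("spec", {}).get("containers", [])
    if not containers:
        problems.append("spec.containers is empty")

    for container in containers:
        name = container.get("name", "<unnamed>")

        # Every container must declare CPU and memory limits.
        limits = container.get("resources", {}).get("limits", {})
        for resource in ("cpu", "memory"):
            if resource not in limits:
                problems.append(f"{name}: missing resources.limits.{resource}")

        # Treat an unset securityContext field as a violation, so the
        # generated manifest has to state the safe value explicitly.
        security = container.get("securityContext", {})
        if security.get("allowPrivilegeEscalation", True):
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        if not security.get("runAsNonRoot", False):
            problems.append(f"{name}: runAsNonRoot must be true")

    return problems
```

A pipeline step could fail the build whenever the returned list is non-empty, which turns "the agent wrote valid Kubernetes" from a review judgment into a check that runs on every change.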

The concrete number that should catch infrastructure teams' attention: Google claimed GKE can now deploy 300 agent sandboxes per second per cluster, with sub-second time to first instruction. The average enterprise deployment in 2025 was fewer than 10 sandboxes per minute. If 300 per second per cluster is achievable today (and this is a Google marketing claim, so treat it with appropriate skepticism), the implication is that per-request agent spawning, rather than persistent agent sessions, becomes economically viable. Each invocation starts from a clean state, which changes the security model significantly: a compromised sandbox doesn't persist, and the blast radius of a bad agent action is bounded by the sandbox lifetime rather than the session lifetime.
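To put the two figures on the same scale, here is a short back-of-the-envelope sketch. The 300 per second and 10 per minute inputs come from the paragraph above; the 30-second sandbox lifetime is a purely hypothetical value chosen for illustration, and the concurrency estimate uses Little's law (average number in the system equals arrival rate times average time in the system), which only holds at steady state.

```python
GKE_CLAIMED_PER_SECOND = 300  # Google's claim, per cluster
BASELINE_PER_MINUTE = 10      # the 2025 enterprise figure (an upper bound)


def spawn_rate_ratio(claimed_per_second: float, baseline_per_minute: float) -> float:
    """How many times faster the claimed rate is than the baseline."""
    return claimed_per_second * 60 / baseline_per_minute


def steady_state_sandboxes(spawn_per_second: float, lifetime_seconds: float) -> float:
    """Little's law: average live sandboxes = arrival rate * average lifetime."""
    return spawn_per_second * lifetime_seconds


ratio = spawn_rate_ratio(GKE_CLAIMED_PER_SECOND, BASELINE_PER_MINUTE)
live = steady_state_sandboxes(GKE_CLAIMED_PER_SECOND, lifetime_seconds=30)  # assumed lifetime
print(f"{ratio:.0f}x the baseline rate, ~{live:.0f} sandboxes live at once")
# prints "1800x the baseline rate, ~9000 sandboxes live at once"
```

If the claim holds, the gap is about three orders of magnitude, which is why spawning a sandbox per request stops being an exotic design and becomes a default worth considering.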

The context problem nobody talks about

Sailesh Krishnamurthy, Google Cloud's VP of Databases, articulated something the model-centric AI press keeps missing: "The models are amazing. They can do a lot of work, but they don't have all the context. The context is in the data." This is the context engineering thesis that several practitioner-focused publications have been building toward over the past six months: the idea that the bottleneck for agentic AI isn't model intelligence but data retrieval and organization.

The specific infrastructure bet Google is making: graph traversal, vector embeddings, full-text search, and relational operations all in one system, with the intelligence to surface the right data without requiring it to move between environments. This isn't a new idea — it's a modern restatement of the "data fabric" concept that has circulated in enterprise IT for a decade. What's different now is the economic pressure: when every LLM API call costs tokens, and when token costs scale with the data volume you send to the model, the economics of moving data around become a first-order engineering concern rather than a platform architecture curiosity.

The teams that figure out selective, intelligent context injection — surfacing only the specific signals the agent needs for the task at hand, not comprehensive system dumps — will have agents that actually help during incidents. The teams that feed entire database schemas, full metric exports, and complete log streams into agent context windows will burn through token budgets and hit context window limits without getting better answers. This is context architecture work, not just configuration, and it requires a different skill set than either traditional backend engineering or prompt engineering.
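Selective context injection can be sketched as a packing problem: rank candidate snippets by relevance and admit them until a token budget is spent. This is a minimal illustration under stated assumptions, not a description of Google's system. The `Snippet` type, the `select_context` function, the relevance scores (assumed to come from some upstream retriever), and the four-characters-per-token estimate are all hypothetical stand-ins; a real deployment would use the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    source: str       # where the data came from, e.g. "alerts"
    text: str         # the content that would be sent to the model
    relevance: float  # assumed score from an upstream retriever, 0..1


def estimate_tokens(text: str) -> int:
    """Crude heuristic of roughly 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_context(snippets: list[Snippet], budget_tokens: int) -> list[Snippet]:
    """Greedily keep the most relevant snippets that fit in the budget."""
    chosen: list[Snippet] = []
    used = 0
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget_tokens:
            chosen.append(snippet)
            used += cost
    return chosen


candidates = [
    Snippet("alerts", "x" * 40, relevance=0.9),       # about 10 tokens
    Snippet("full-schema", "x" * 400, relevance=0.5),  # about 100 tokens
    Snippet("deploys", "x" * 80, relevance=0.7),       # about 20 tokens
]
picked = select_context(candidates, budget_tokens=35)
print([s.source for s in picked])  # prints ['alerts', 'deploys']
```

The greedy pass is deliberately simple. The point is that the large, low-relevance schema dump is what gets dropped, so the budget goes to the signals most likely to matter for the task.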

The partner ecosystem as the delivery mechanism

Google announced a $750 million partner ecosystem commitment targeting 120,000+ members across the Google Cloud Partner Network, specifically for what they call agentic delivery chains: agents from one partner talking to agents from other members of the network. That's an unusually specific number for a cloud partner program, and the framing is deliberate: Google is betting that enterprise agentic AI won't be delivered as a standalone product but as a chain of integrated capabilities across multiple vendors.

This is the same architecture thesis that Anthropic is pursuing with its technology partner integrations for Claude Security — CrowdStrike, Microsoft, Palo Alto, SentinelOne, and Wiz all integrating Opus 4.7 into existing security platforms. The difference is that Google is positioning the partner layer as a primary differentiator rather than a secondary distribution channel. For enterprise buyers, this means evaluating an agentic AI deployment isn't just "which model are we using" — it's "which partner ecosystem do we build on, and what does the integration surface look like." That changes the procurement conversation from vendor selection to architecture selection.

The x86 efficiency story as a budget reallocator

One detail from theCUBE's coverage that should catch FinOps teams' attention: AMD-based instances on Google Cloud are enabling enterprises to fund agentic AI investments from infrastructure savings. Sabre migrated 50,000 virtual CPUs with "zero code changes" and used the freed budget to invest in agentic AI. This is the practical mechanism by which legacy infrastructure cost optimization becomes a path to AI adoption: not a new budget line, but a reallocation from old compute to new. For platform teams that have been trying to fund AI initiatives from existing cloud budgets, this is both a demonstrated playbook and a reminder that the unit economics of cloud computing are in enough flux that optimization-driven reallocation remains possible.

The token economics point John Furrier raised — that tokens are becoming a new kind of corporate currency restructuring how teams operate — is the macro observation that ties the infrastructure story together. When cloud spend was infrastructure (compute, storage), it was a line item. When cloud spend becomes agentic — every API call, every context window, every tool invocation metered in tokens — it becomes a variable cost that touches every team and every workflow. The organizations that figure out token governance first will have a structural advantage over those still treating AI budgets as research spending rather than operational infrastructure.
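What token governance might look like at its simplest is a per-team ledger that meters usage against a budget and refuses calls that would exceed it. The sketch below is hypothetical throughout: the `TokenLedger` class, the team names, and the budget figures are invented for illustration, and a production system would need persistence, time windows, and concurrency control.

```python
from collections import defaultdict


class TokenLedger:
    """Hypothetical in-memory ledger of token spend per team."""

    def __init__(self, budgets: dict[str, int]) -> None:
        self._budgets = dict(budgets)
        self._spent: defaultdict[str, int] = defaultdict(int)

    def charge(self, team: str, tokens: int) -> bool:
        """Record the spend and return True, or return False if over budget."""
        if team not in self._budgets:
            raise KeyError(f"no budget configured for team {team!r}")
        if self._spent[team] + tokens > self._budgets[team]:
            return False
        self._spent[team] += tokens
        return True

    def remaining(self, team: str) -> int:
        """Tokens the team can still spend."""
        return self._budgets[team] - self._spent[team]


ledger = TokenLedger({"platform": 1_000, "support": 500})  # illustrative budgets
ledger.charge("platform", 600)              # accepted
accepted = ledger.charge("platform", 500)   # rejected, would exceed 1,000
print(accepted, ledger.remaining("platform"))  # prints "False 400"
```

Rejecting the call rather than logging an overage is one possible policy. The broader point is that once spend is metered per team, it can be governed like any other operational cost.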

What this means for builders

The Google Cloud Next recap is not a coding story, but it's the infrastructure story that determines the deployment context for every agentic coding tool built for enterprise use. Three concrete implications for the LGTM audience:

First, if Kubernetes is the OS of the agentic era, platform engineers need to be in the conversation about AI coding tools, not just developers. The skills that matter — cluster management, sandbox isolation, resource scheduling — are platform engineering skills, not frontend or backend development skills. Teams that have platform engineers who understand agentic workflows will have a meaningful advantage over those where AI coding is purely a developer tooling conversation.

Second, the per-second sandbox deployment rate is a leading indicator of where security architecture is heading. Sandboxes that start clean and dispose cleanly are architecturally different from persistent sessions with broad system access. The teams that design their agentic workflows around ephemeral, scoped execution — rather than long-running sessions with elevated privileges — will be better positioned when enterprise security teams start asking hard questions about agentic AI attack surfaces.
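The lifecycle difference between ephemeral and persistent execution can be shown in a few lines. This is a sketch of the lifecycle only, assuming a temporary directory as a stand-in for a sandbox: it provides no security isolation, and the names `ephemeral_workspace` and `run_task` are hypothetical. What it demonstrates is that each task starts with no state from earlier tasks and leaves none behind.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def ephemeral_workspace() -> Iterator[Path]:
    """Yield a fresh directory that is deleted when the block exits."""
    with tempfile.TemporaryDirectory(prefix="agent-") as tmp:
        yield Path(tmp)


def run_task(task_name: str) -> tuple[Path, list[str]]:
    """Run one task in its own workspace.

    Returns the workspace path and whatever files existed at start,
    so a caller can verify that nothing leaked in from earlier tasks.
    """
    with ephemeral_workspace() as workspace:
        preexisting = sorted(p.name for p in workspace.iterdir())
        (workspace / f"{task_name}.out").write_text("done")
        return workspace, preexisting


first_path, first_seen = run_task("refactor")
second_path, second_seen = run_task("migrate")
print(first_seen, second_seen)                    # prints "[] []"
print(first_path.exists(), second_path.exists())  # prints "False False"
```

A persistent session would instead accumulate files, credentials, and side effects across tasks, which is exactly the state a security review has to reason about.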

Third, the context engineering discipline — organizing, governing, and tagging enterprise information with metadata and business context before it reaches the LLM — is not a nice-to-have. It's the difference between an agent that operates on comprehensive but undifferentiated data dumps and an agent that receives exactly the signal it needs to take the right action. This is unglamorous data engineering work, but it's the foundation that everything else builds on.

The hyperscalers are betting heavily on agentic AI, and they're backing those bets with infrastructure commitments that will shape the market for years. Knowing which direction they're pushing — and why — is knowing where the market is going.

Sources: SiliconANGLE, Google Cloud Blog, The New Stack
