Your Agent Workflows Don't Have to Poll Anymore — Gemini API Webhooks Are Here

Anatoliy Kolodkin

06 May 2026 • 4 min read

For the past two years, every time a developer has submitted a long-running job to the Gemini API — a Deep Research task, a video generation job, a batch of documents to process — the standard answer was the same: poll until it's done. Check back in five seconds. Then five more. Then ten. Exponential backoff with jitter to avoid rate limit collisions, a retry loop to handle transient errors, and a monitoring mechanism to surface the job's state when something inevitably went wrong. This was not a secret. It was not a scandal. It was just how you built agentic systems with Google's API. It also created exactly the kind of fragile architectural scaffolding that breaks in production in ways that are genuinely hard to debug.
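For reference, the polling scaffolding described above tends to look something like the following sketch. It is not taken from Google's SDK: `get_status` is a hypothetical callable standing in for whatever call returns a job's state, and the state names and delay values are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    base_delay: float = 5.0,
    max_delay: float = 60.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports a terminal state.

    get_status is a hypothetical stand-in for an SDK call; the terminal
    state names below are assumptions, not the Gemini API's own values.
    """
    terminal = {"SUCCEEDED", "FAILED", "CANCELLED"}
    for attempt in range(max_attempts):
        try:
            status = get_status()
            if status in terminal:
                return status
        except ConnectionError:
            # Transient error: fall through and retry after backing off.
            pass
        # Exponential backoff capped at max_delay, with full jitter so that
        # many clients do not retry in lockstep.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    # Fail loudly instead of polling forever, so a stuck job is visible.
    raise TimeoutError(f"job still not finished after {max_attempts} polls")
```

Even this small version carries four tunable parameters and two failure paths, which is the maintenance cost the rest of this piece is about.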

Google shipped event-driven webhooks in the Gemini API on May 4, and the practical effect is to make that polling pattern optional rather than mandatory. The implementation follows the Standard Webhooks specification, an open specification for webhook signing and delivery that builds on the signed-payload pattern familiar from platforms such as Stripe and GitHub. That decision to follow an existing standard rather than inventing Google's own signing scheme is the first signal that this was built by people who have dealt with webhook integrations before. If you've ever verified a Stripe webhook signature or authenticated a GitHub webhook payload, the mental model ports directly. Every request arrives with three security headers: webhook-signature, webhook-id, and webhook-timestamp. Signatures use a shared-secret HMAC for project-level webhooks and keys published as a JSON Web Key Set (JWKS) for per-request overrides. That distinction, HMAC for global project config and JWKS for dynamic per-job routing, matches how teams actually work: most jobs go to one endpoint, but multi-tenant setups and environments with different security domains need the flexibility to route specific requests to different receivers.
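To make the HMAC path concrete, here is a minimal sketch of verifying the three headers the way the Standard Webhooks specification describes its symmetric `v1` scheme: HMAC-SHA256 over `id.timestamp.body`, base64-encoded. It assumes Gemini's project-level webhooks follow that scheme unchanged, that the secret uses the spec's base64 format with an optional `whsec_` prefix, and that header names arrive lowercased. The function name and the five-minute tolerance are illustrative choices, and the JWKS path is not covered.

```python
import base64
import hashlib
import hmac
import time
from typing import Dict, Optional


def verify_standard_webhook(
    secret: str,
    headers: Dict[str, str],
    body: bytes,
    tolerance_seconds: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Return True if the request carries a valid v1 (HMAC-SHA256) signature."""
    msg_id = headers.get("webhook-id", "")
    timestamp = headers.get("webhook-timestamp", "")
    signature_header = headers.get("webhook-signature", "")
    if not (msg_id and timestamp and signature_header):
        return False

    # Reject stale or future-dated deliveries to limit replay attacks.
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_seconds:
        return False

    # Assumption: the secret is base64, optionally prefixed with "whsec_".
    raw_secret = secret[len("whsec_"):] if secret.startswith("whsec_") else secret
    key = base64.b64decode(raw_secret)

    # The signed content is "<id>.<timestamp>.<raw body>".
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    )

    # The header may hold several space-separated "version,signature" entries.
    for entry in signature_header.split():
        version, _, candidate = entry.partition(",")
        if version == "v1" and hmac.compare_digest(candidate.encode(), expected):
            return True
    return False
```

Verification has to run against the raw request bytes, before any JSON parsing, because re-serializing the payload can change the bytes that were signed. In practice a maintained Standard Webhooks library is a safer choice than hand-rolled verification.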

The delivery guarantee is "at-least-once" with automatic retries for up to 24 hours. That is standard practice for webhook systems, and it means Google is not inventing new semantics here; it is adopting the industry norm. At-least-once also means the same notification can arrive more than once, so receivers should be idempotent. If your receiver goes down for an hour, the notification is redelivered when it comes back online. If it goes down for 25 hours, the notification is lost, even though the job itself may have finished. That 24-hour bound is the one constraint the announcement does not dwell on, and it is worth dwelling on: if you're building a mission-critical system where missed notifications are genuinely costly, you still need a fallback, such as a polling endpoint for orphaned jobs, a dead-letter queue, or some other way to reconcile "what did I submit" against "what did I receive." Google is not offering a managed replay UI for missed webhooks, which means your receiver's availability is still your problem. The announcement removes the polling loop from the happy path. It does not eliminate the need for a recovery mechanism when things go wrong off the happy path.
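One way to handle both consequences of that guarantee, duplicates and missed notifications, is a small ledger that records submissions and deliveries. The sketch below is an in-memory illustration only: the class and method names are hypothetical, it assumes the `webhook-id` header is stable across retries of the same delivery, and a real system would keep this state in a database.

```python
class DeliveryLedger:
    """Tracks submitted jobs and the webhook deliveries seen for them."""

    def __init__(self) -> None:
        self.submitted: set[str] = set()
        self.completed: set[str] = set()
        self.seen_delivery_ids: set[str] = set()

    def record_submission(self, job_id: str) -> None:
        """Call when a job is submitted, before any webhook can arrive."""
        self.submitted.add(job_id)

    def record_delivery(self, delivery_id: str, job_id: str) -> bool:
        """Return True the first time a delivery is seen, False for a repeat."""
        if delivery_id in self.seen_delivery_ids:
            return False  # duplicate from an at-least-once retry; skip it
        self.seen_delivery_ids.add(delivery_id)
        self.completed.add(job_id)
        return True

    def orphaned_jobs(self) -> set[str]:
        """Jobs submitted but never reported: candidates for a fallback poll."""
        return self.submitted - self.completed
```

A periodic task can call `orphaned_jobs()` for jobs older than the retry bound and check their status directly, which keeps polling as a recovery path rather than the main loop.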

The developer response has been focused on the architectural implications rather than the API mechanics. A Chinese developer community writeup, framed as "长任务 Agent 不该靠轮询硬等," or "long-running agent tasks shouldn't have to hard-poll," identified the core design argument: this is infrastructure for async agentic systems, not just a convenience for API consumers. A Japanese developer blog post emphasized the correct pattern for webhook receivers: respond quickly with a 200 after receiving the notification, then process the actual work asynchronously in the background. If you do the heavy lifting inside the webhook handler before returning, and that work takes longer than the sender is willing to wait for a response, you enter a retransmission loop where Google re-delivers the same notification because it hasn't seen your 200 yet. That is a subtle but real failure mode that trips up developers who haven't built webhook integrations before. The announcement's Python SDK example shows the right pattern on the submitting side: configure the webhook on the batch task, return control immediately, receive the notification when the job completes.
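The acknowledge-first pattern can be sketched without any web framework. In the example below, `handle_delivery` stands in for whatever route handler your framework calls, and `process` is a placeholder for the slow work; both names are hypothetical, and the in-process queue is an assumption made to keep the example self-contained.

```python
import queue
import threading
from typing import List

work_queue: queue.Queue = queue.Queue()
results: List[int] = []


def handle_delivery(body: bytes) -> int:
    """Webhook entry point: enqueue the payload and acknowledge right away."""
    work_queue.put(body)
    return 200  # returned before any heavy processing starts


def process(body: bytes) -> None:
    """Placeholder for the slow work, such as fetching and storing results."""
    results.append(len(body))


def worker() -> None:
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        body = work_queue.get()
        if body is None:
            break
        process(body)


if __name__ == "__main__":
    thread = threading.Thread(target=worker)
    thread.start()
    print(handle_delivery(b'{"job": "example"}'))  # acknowledges immediately
    work_queue.put(None)  # tell the worker to stop
    thread.join()
```

An in-memory queue loses pending work if the process crashes after acknowledging, so a production receiver would more likely hand the payload to a durable queue before returning the 200.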

The practical shift for builders is this: the architecture for long-running agent workflows changes from submit job → poll repeatedly → process result to submit job → receive webhook notification → process result. That sounds incremental. In terms of system design, it is. In terms of code you have to write, maintain, and debug at 2 AM when something goes wrong, it is not. Polling loops have a specific failure mode in agentic systems: you submit a job, you poll every N seconds, you hit a rate limit or a transient error, your retry logic has a bug, and now your agent is stuck in a state that is invisible from outside. The agent cannot observe its own stuckness. The operator cannot observe it without adding instrumentation. Webhooks don't fix the retry logic problem — but they eliminate the polling loop as a required architectural component, which removes an entire category of scaffolding that was only there because there was no alternative.

What this means for specific workflows: if you're running Deep Research agents, you can now register an endpoint and receive a push notification when the research job completes, rather than maintaining a polling loop with exponential backoff. If you're building video generation pipelines, you can route completion notifications to the right downstream processor without keeping a job alive on your side. If you're processing batch API jobs at scale, you can distribute work to workers based on actual completion signals rather than estimated completion times. Each of those scenarios had a workable solution before. None of those solutions was as simple to reason about, as easy to debug, or as efficient as a push-based model. Now the push-based model is an option for all three.

Google's choice to follow the Standard Webhooks specification is the detail that will age best. API platforms that invent their own webhook signing schemes create ongoing maintenance burden for every consumer: new libraries, new verification code, new documentation to read and re-read every time the scheme evolves. When Stripe changes how its webhooks work, there is a community library for every language that handles the migration. When GitHub updates its webhook payload format, the integrations hold because the community has invested in keeping them current. Google's adoption of an existing standard means that same community infrastructure will eventually exist for Gemini API webhooks, even if it doesn't exist today. That is a boring infrastructure decision. It is also the right one.

The webhook system is available now in the Gemini API, with documentation at ai.google.dev/gemini-api/docs/webhooks and an end-to-end quickstart in the Gemini API Cookbook. The Python SDK example in the announcement shows the configuration pattern clearly. For teams already running polling loops against the Gemini API, this is the week to migrate. For teams designing new agentic workflows, the polling pattern is now the legacy approach — and the new approach is documented, standards-based, and ready to use.

Sources: Google Developer Blog, Standard Webhooks specification, Gemini API Cookbook