Gemini CLI's Offline Ripgrep Push Is Google Finally Taking Local-First Coding Agents Seriously

Anatoliy Kolodkin

02 May 2026 • 6 min read

There is a version of the Gemini CLI story that reads as a routine patch release with some nightly-build noise attached. That is the version that surfaces in your feed, gets a two-paragraph summary, and disappears. It is also the wrong version.

The right version starts with a specific engineering problem that has quietly defined the ceiling for CLI-based coding agents in enterprise environments: they require network access to do the things developers need most. Code search, navigation, and retrieval are fundamentally local operations — they run against code that lives on your machine. When the agent needs to phone home for every grep, that constraint shapes what the tool can realistically be used for. It becomes a pair programming partner for drafting and discussion, but not a real development environment you can drive from a terminal without an open network pipe.

Gemini CLI v0.40.0, released in late April and partially cherry-picked into v0.40.1 on April 30, is Google's most direct acknowledgment that this constraint is a problem worth solving, and that solving it is a competitive necessity rather than a nice-to-have feature. The changelog item is deceptively small: "bundle ripgrep binaries into SEA for offline support." SEA here refers to the single executable application build of the CLI. What the change means is that Gemini CLI now ships ripgrep inside that executable, enabling offline code search without reaching out to any external service. That is the difference between a coding agent that works on an airplane and one that stalls the moment you step outside the corporate VPN.
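
To make the idea concrete, here is a minimal Python sketch of the "bundled binary first, system binary second" lookup that offline support implies. It is not Gemini CLI's actual code, which the quoted changelog does not show. The function names, the bundle_dir layout, and the fallback order are all assumptions made for illustration.

```python
import os
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Callable, Optional


def resolve_ripgrep(
    bundle_dir: Path,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a path to ripgrep, preferring a copy shipped next to the tool.

    Hypothetical layout: the bundled binary sits directly in `bundle_dir`
    as `rg` (or `rg.exe` on Windows). No network access is attempted.
    """
    name = "rg.exe" if sys.platform == "win32" else "rg"
    bundled = bundle_dir / name
    if bundled.is_file() and os.access(bundled, os.X_OK):
        return str(bundled)
    # Fall back to whatever ripgrep is already installed on PATH, if any.
    return which("rg")


def search(pattern: str, root: str, rg_path: str) -> list[str]:
    """Run ripgrep locally and return matching lines (empty list if none)."""
    result = subprocess.run(
        [rg_path, "--line-number", "--no-heading", pattern, root],
        capture_output=True,
        text=True,
        check=False,  # ripgrep exits with status 1 when nothing matches
    )
    return result.stdout.splitlines()
```

The point of the design is that neither function ever touches the network: if the bundled copy is present, search works on a disconnected machine, and if it is missing, the lookup degrades to a locally installed ripgrep or returns None rather than trying to download one.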

For practitioners who have been watching the CLI coding agent space — specifically the emerging competition between Gemini CLI and tools like Claude Code — this addition is the most tangible signal yet that Google is playing for keeps in the local development environment. Claude Code's ability to run without mandatory cloud round-trips has been one of its quietest competitive advantages. Enterprises with air-gapped developer machines, contractors on restricted networks, and security policies that prohibit external API calls from build environments have had a genuine reason to look elsewhere or make exceptions. Bundled ripgrep is Google's first concrete move to close that gap. Whether it fully matches Claude Code's offline architecture is still an open question — but the direction is clear and the pace is fast.

What v0.40.0 Actually Ships

The v0.40.0 release is more substantial than its patch successor suggests. The cherry-pick in v0.40.1 is narrow, and it shipped with a regression of its own (a bug that causes Gemini 3.1 Pro to hang indefinitely on thinking tasks after upgrading), but the full v0.40.0 changelog spans roughly 20 merged pull requests, several of which deserve attention on their own merits.

The ripgrep bundling is the headline, but it is not alone. The release also includes "fix(core): prevent YOLO mode from being downgraded" — a safety-related change that ensures permissive execution flags cannot be silently overridden by other configuration layers. For teams running Gemini CLI in CI or automated contexts, this is the kind of change that prevents subtle, hard-to-debug permission regressions. The fact that it exists at all also suggests that Google's internal users and early adopters have been running the CLI in sufficiently varied environments that the team has encountered cases where execution mode state was being inadvertently reset.

Also worth noting: "feat(core): integrate skill-creator into skill extraction agent." This is infrastructure for a workflow that Google has been building toward across several releases — the ability for Gemini CLI to not just execute tasks but to extract patterns from how it executes them and scaffold those into reusable skills. If that sounds abstract, it is because the user-facing implications are still being defined. But the direction is coherent: Gemini CLI is not just a prompt-and-response interface. It is becoming a system that learns from its own operation and exports that learning into structured, reusable components.

The MCP resource tooling additions — "feat(core): add tools to list and read MCP resources" — signal a similar architectural ambition. MCP, the Model Context Protocol, is increasingly the standard interface layer for connecting language models to external tools and data sources. Making MCP resource browsing a first-class CLI feature means Gemini CLI is positioning itself as a coordination surface for heterogeneous tool ecosystems, not just a standalone agent.
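
For readers who have not looked at MCP on the wire, the protocol exposes resources through two JSON-RPC 2.0 methods, resources/list and resources/read. The Python sketch below only builds those request messages and reads URIs out of a list result. How Gemini CLI wraps these calls in its new tools is not described in the changelog, so the helper names here are hypothetical and the example URI in the usage note is made up.

```python
from itertools import count
from typing import Any, Optional

_ids = count(1)  # JSON-RPC request ids only need to be unique within a session


def jsonrpc_request(
    method: str, params: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request object with a fresh id."""
    msg: dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_resources_request() -> dict[str, Any]:
    """Request the resources an MCP server advertises."""
    return jsonrpc_request("resources/list")


def read_resource_request(uri: str) -> dict[str, Any]:
    """Request the contents of one resource, identified by its URI."""
    return jsonrpc_request("resources/read", {"uri": uri})


def resource_uris(list_result: dict[str, Any]) -> list[str]:
    """Extract the URIs from the `result` payload of a resources/list reply."""
    return [r["uri"] for r in list_result.get("resources", [])]
```

A client would send list_resources_request() over whatever transport the server uses, pass the reply's result field to resource_uris, and then issue read_resource_request("file:///repo/README.md") or similar for each URI it wants to load.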

The Subagent Delegation Tests Are the Part Nobody Is Talking About

The most forward-looking item in the v0.40.0 changelog is not user-facing at all: "test(evals): add subagent delegation evaluation tests." This is a test suite for formal evaluation of one agent handing work off to another agent within a single session. It is not a shipped feature. It is not documented in release notes that end users will read. But it is arguably the most important signal in the entire release.

What it tells you is that Google is building internal evaluation infrastructure for multi-agent coordination inside Gemini CLI. That is a prerequisite for actually shipping multi-agent orchestration as a first-class product capability. You cannot reliably ship a feature that lets one agent delegate subtasks to specialized subagents unless you have a way to measure whether those delegations are working correctly — whether the right context is being passed, whether the subagent is producing output that the parent can use, whether the handoff is reliable across different task types.
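
The three questions in that paragraph map naturally onto a small evaluation harness. The Python sketch below is purely illustrative: the changelog does not describe the real test suite, so the DelegationCase and DelegationRecord types, the scoring rules, and the pass-rate metric are all assumptions about what such an evaluation could look like.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DelegationCase:
    """One evaluation scenario (hypothetical shape)."""
    task: str
    required_context: set[str]        # context keys the subagent must receive
    is_usable: Callable[[str], bool]  # parent-side check on the subagent output


@dataclass
class DelegationRecord:
    """What was observed when the parent agent delegated the task."""
    passed_context: set[str]
    subagent_output: str


def score(case: DelegationCase, record: DelegationRecord) -> dict[str, bool]:
    """Check that the right context was passed and the output is usable."""
    context_ok = case.required_context <= record.passed_context
    output_ok = case.is_usable(record.subagent_output)
    return {
        "context_ok": context_ok,
        "output_ok": output_ok,
        "passed": context_ok and output_ok,
    }


def pass_rate(results: list[dict[str, bool]]) -> float:
    """Fraction of cases that passed, as a rough handoff-reliability figure."""
    if not results:
        return 0.0
    return sum(r["passed"] for r in results) / len(results)
```

Running the same cases across different task types and comparing pass rates is one simple way to answer the reliability question; a real suite would presumably use richer checks than a boolean per case.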

The practical implication for practitioners is that the roadmap is almost certainly heading toward Gemini CLI sessions that coordinate multiple specialized agents working on different parts of the same codebase simultaneously. A planning agent that breaks down a refactoring task, a writing agent that implements the changes, a testing agent that validates the output — all operating under a single CLI session with shared context. That is a meaningfully different product than what exists today. The fact that Google is investing in evaluation infrastructure for this pattern, rather than just shipping features and hoping they work, suggests the team is taking the reliability requirement seriously. Multi-agent systems that are not rigorously evaluated tend to fail in ways that are expensive to debug in production.

The Regression Is a Reminder

No discussion of v0.40.1 is complete without acknowledging the regression it introduced: upgrading from v0.40.0 to v0.40.1 causes Gemini 3.1 Pro to hang indefinitely on thinking tasks. The bug was reported on May 1 and is being actively triaged, which is the expected response from a team maintaining a fast-moving, nightly-driven release process. The workaround for teams hitting the hang is to pin to v0.40.0 or use the nightly channel until the patch stabilizes.

For teams evaluating Gemini CLI for production CI use, this is a useful data point about the release model. The CLI is still releasing through nightly-driven channels, which means the stable tag can accumulate regressions that the nightly channel catches first. Production deployments should have a pinning strategy for tool versions and a process for evaluating nightly builds before promoting them to critical paths. This is not unique to Gemini CLI; it is true of any tool in active development with a fast release cadence. But it is worth stating explicitly, because the alternative is discovering the hard way that your CI pipeline hung on a Tuesday morning because someone upgraded a patch version.
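
One lightweight form of the pinning strategy is a guard step that fails the pipeline when the installed tool version drifts from the pinned one. The Python sketch below assumes the installed version string has already been obtained somehow (for example from the CLI's own version output, which is an assumption here) and only handles parsing and comparison. The function names are hypothetical.

```python
import re

# Accepts "0.40.0", "v0.40.1", or a pre-release such as "0.42.0-nightly".
_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse_version(text: str) -> tuple[int, int, int, str]:
    """Split a version string into (major, minor, patch, pre-release tag)."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, pre = match.groups()
    return int(major), int(minor), int(patch), pre or ""


def check_pin(installed: str, pinned: str) -> None:
    """Raise if the installed version is not exactly the pinned version."""
    if parse_version(installed) != parse_version(pinned):
        raise RuntimeError(
            f"installed version {installed} does not match pinned {pinned}"
        )
```

With the pin set to v0.40.0, check_pin("0.40.0", "v0.40.0") returns quietly, while check_pin("0.40.1", "v0.40.0") raises and stops the job before the upgraded tool reaches a critical path.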

Why This Matters for the I/O Roadmap

Google I/O is on May 19. The nightly channel had already reached v0.42.0-nightly as of April 28, which means the release cadence is rapid and the team is accumulating features at speed. The offline ripgrep capability, the subagent delegation evaluation infrastructure, and the MCP resource tooling are not random commits. They paint a coherent picture of a tool that is becoming a serious local development environment rather than a cloud-dependent chat interface with terminal output.

The competitive context matters here. Claude Code has established that a local-first, offline-capable CLI coding agent is a viable and valuable product category. Google is now racing to match that position with a tool that also has native access to Gemini models, Google Cloud integrations, and the broader Google AI ecosystem. Whether Gemini CLI ends up being the default choice for developers inside Google-aligned environments or whether it carves out a different position in the market is a question that the next few releases — and the I/O announcements — will start to answer.

For practitioners, the immediate takeaway is concrete: if you have been waiting for Gemini CLI to be stable enough for serious local development work, the offline ripgrep addition in v0.40.0 is a meaningful step in that direction. The regression in v0.40.1 is a temporary drag, but the trajectory is clear. The subagent evaluation infrastructure is the longer-range signal worth watching — it suggests the team is building toward something more capable than a solo coding session tool. That something, whatever it ends up being, will arrive faster than the release cadence makes it look.

Sources: GitHub — Gemini CLI v0.40.1 Release, GitHub — Gemini CLI v0.40.0
