Gemini CLI’s May 21 Nightly Wires Agent Sessions Into Google’s Developer Stack

Gemini CLI’s May 21 nightly is not the loudest part of Google’s developer story this week. That honor goes to the post-I/O platform narrative: Gemini models, ADK, A2A, managed agents, and the usual promise that everything will become agent-shaped if you squint at the architecture diagram long enough.

But builders should pay attention to the nightly anyway, because this is where the platform either becomes usable or stays a keynote. The release wires local and remote agent sessions into the CLI, improves MCP and non-interactive behavior, adds snapshot recovery, respects NO_PROXY, and fixes a path traversal issue in custom command file injection. That is runtime plumbing. Runtime plumbing is where agent products become engineering tools.

The important word is AgentSession, not Gemini

The Gemini CLI README sells the obvious strengths: Gemini 3 models, a 1M-token context window, Google Search grounding, file operations, shell commands, web fetching, and MCP support. It also advertises a generous personal-account free tier of 60 requests/min and 1,000 requests/day. Those are attractive numbers, but they are not the story in this nightly.

The story is AgentSession. PR #26665 adds LocalSessionInvocation, wrapping LocalAgentExecutor behind AgentProtocol. PR #26937 adds RemoteSessionInvocation, wrapping A2A client streaming behind the same protocol and handling initialState with contextId and taskId so state can persist across invocations. PR #26948 then wires AgentSession invocations into AgentTool behind the adk.agentSessionSubagentEnabled feature flag.

That is a lot of implementation detail to say something simple: Google is trying to make local and remote subagent work look like one invocation model. That is the correct abstraction if Gemini CLI is supposed to sit inside a broader ADK/A2A ecosystem. Some agent work belongs on the developer’s machine, close to the repo and shell state. Some belongs in a remote session where policy, resources, or long-running execution live elsewhere. A common protocol gives teams a chance to reason about delegation without rewriting every tool call.
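To make the "one invocation model" idea concrete, here is a minimal Python sketch of local and remote work behind a shared interface. It is not the Gemini CLI's TypeScript code: AgentInvocation, SessionState, LocalInvocation, RemoteInvocation, and run are hypothetical names, and the only detail taken from the release notes is that remote state carries a context ID and a task ID across invocations.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Protocol


@dataclass
class SessionState:
    # Hypothetical container: the release notes only say that initial state
    # carries contextId and taskId so state can persist across invocations.
    context_id: Optional[str] = None
    task_id: Optional[str] = None


class AgentInvocation(Protocol):
    """Shared shape that both local and remote invocations satisfy."""

    def invoke(self, prompt: str, state: SessionState) -> Iterator[str]: ...


class LocalInvocation:
    """Stands in for an in-process executor; it needs no remote IDs."""

    def invoke(self, prompt: str, state: SessionState) -> Iterator[str]:
        yield f"[local] {prompt}"


class RemoteInvocation:
    """Stands in for a streaming remote client (no network call is made)."""

    def __init__(self) -> None:
        self._task_counter = 0

    def invoke(self, prompt: str, state: SessionState) -> Iterator[str]:
        # Reuse the context across calls; allocate a fresh task per call.
        if state.context_id is None:
            state.context_id = "ctx-1"
        self._task_counter += 1
        state.task_id = f"task-{self._task_counter}"
        yield f"[remote {state.context_id}/{state.task_id}] {prompt}"


def run(invocation: AgentInvocation, prompt: str, state: SessionState) -> list[str]:
    # The caller does not care which side of the boundary does the work.
    return list(invocation.invoke(prompt, state))
```

The caller in run never branches on where the work happens, which is the appeal. It is also the hazard: anything that differs between the two sides, such as permissions or failure modes, has to be surfaced some other way.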

The risk is that the abstraction hides too much. Builders should watch what state crosses the local/remote boundary, how streaming failures appear to the user, whether permissions differ between local and remote sessions, and how audit trails represent subagent work. “It uses the same protocol” is useful only if the operational semantics are also visible.

Custom command file injection needed the security fix it just got

The sharpest change in the nightly is PR #27234, which prevents path traversal in custom command file injection via @{...} syntax. The fix validates resolved absolute and relative paths against workspace.isPathWithinWorkspace(), re-validates recursively discovered files, uses realpath resolution to handle symlink chains, and includes regression tests attempting to read /tmp/secret.txt from outside the workspace.

This is exactly the kind of bug every coding-agent team should assume exists until proven otherwise. Prompt syntax that can pull files into context is powerful because it reduces friction. It is dangerous for the same reason. If @{...} can be tricked into reading outside the repository, the agent now has a data-exfiltration primitive dressed up as convenience.

The right boundary for a coding agent is not “the user could have read the file manually.” The right boundary is “the agent should only read what the workspace policy permits, through mechanisms that can be audited.” Realpath checks against symlink chains are not paranoia. They are what separates an agent feature from a filesystem footgun.
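The containment check described above can be sketched in a few lines of Python. This is an illustration of the general technique (resolve symlinks first, then compare against the workspace root), not the CLI's actual implementation; is_path_within_workspace here is a stand-in that merely echoes the name mentioned in the PR, and demo is a hypothetical helper that builds a throwaway directory tree.

```python
import os
import tempfile


def is_path_within_workspace(workspace_root: str, candidate: str) -> bool:
    """Return True only if candidate resolves to a location under the root."""
    root = os.path.realpath(workspace_root)
    # Relative candidates are resolved against the workspace root; absolute
    # candidates replace it, which is how os.path.join behaves.
    target = os.path.realpath(os.path.join(root, candidate))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised, for example, when the paths sit on different Windows drives.
        return False


def demo() -> dict:
    """Exercise the check against dot-dot, absolute, and symlink escapes."""
    with tempfile.TemporaryDirectory() as base:
        workspace = os.path.join(base, "repo")
        os.mkdir(workspace)
        with open(os.path.join(workspace, "notes.md"), "w") as handle:
            handle.write("ok")
        secret = os.path.join(base, "secret.txt")
        with open(secret, "w") as handle:
            handle.write("secret")
        # A symlink that lives inside the workspace but points outside it.
        os.symlink(secret, os.path.join(workspace, "link.txt"))
        return {
            "inside": is_path_within_workspace(workspace, "notes.md"),
            "dotdot": is_path_within_workspace(workspace, "../secret.txt"),
            "absolute": is_path_within_workspace(workspace, secret),
            "symlink": is_path_within_workspace(workspace, "link.txt"),
        }
```

On a POSIX system, demo reports True only for the file that really lives in the workspace. The symlink case is the one that a plain string-prefix check would miss, because the link's own path looks perfectly local.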

Non-interactive mode is where demos go to become production

The release notes also list a run of quieter changes: configured MCP servers are now allowed in non-interactive mode, MCP tools handle nullable arrays, snapshots can be recovered across sessions, the A2A server loads a default policy, file-storage migration handles exceptions, and the global fetch dispatcher respects NO_PROXY. These are not glamorous. They are the difference between a tool that works in a watched terminal and a tool that can run from scripts, CI, editors, cron, and ACP-style harnesses.

Configured MCP in non-interactive mode is especially important. Teams do not only invoke coding agents by typing into a TTY. They run them as background jobs, review bots, local automations, and editor integrations. If non-interactive mode does not see the intended MCP surface, automation results diverge from interactive results. That divergence is poison because nobody can reproduce the failure path. The human saw one tool environment; the job saw another.
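One way to catch that divergence is to record the tool names each mode reports and diff them. The Python sketch below assumes you have already collected the two lists by whatever means your setup allows; diff_tool_surface and the tool names in the usage note are hypothetical, and nothing here calls the CLI.

```python
def diff_tool_surface(
    interactive: set[str], non_interactive: set[str]
) -> dict[str, list[str]]:
    """Report tools visible in one mode but not the other."""
    return {
        # Tools a human would see that the automated job would not.
        "missing_in_automation": sorted(interactive - non_interactive),
        # Tools only the automated job sees, which is just as suspicious.
        "extra_in_automation": sorted(non_interactive - interactive),
    }
```

For example, comparing {"read_file", "search_docs"} against {"read_file"} flags search_docs as missing in automation. Two empty lists are the result you want before trusting a scripted run to behave like the interactive one.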

Snapshot recovery points in the same direction. Long-running agent work has to survive interruption without pretending every restart is a clean chat. If a session can be resumed with state intact, the agent becomes closer to a process. If not, it remains a transcript with commitment issues.
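As a rough illustration of what "survive interruption" means in practice, here is a generic Python sketch of snapshotting session state to disk and reading it back. It says nothing about how Gemini CLI stores snapshots; save_snapshot, load_snapshot, and the JSON layout are assumptions made for the example. The one real design point is the atomic replace, so a crash mid-write cannot leave a half-written snapshot behind.

```python
import json
import os
import tempfile


def save_snapshot(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Return the saved state, or an empty dict when no snapshot exists."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```

A resumed session would call load_snapshot first and treat an empty result as a genuinely fresh start, rather than silently discarding whatever the previous run had built up.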

The Windows and terminal fixes are smaller but still useful: prefer pwsh.exe over Windows PowerShell 5.1, avoid binary false-positives on Windows PTY streams, prevent unmapped Vim Normal mode keys from inserting text into prompt input, and add Sublime Text plus Emacs Client editor support. Agent CLIs live inside messy developer environments. Supporting the mess is product work.

How to evaluate this without buying the whole platform story

If you are already in the Google/Gemini lane, this nightly is worth testing in a disposable repository. Do not standardize on a nightly because the release notes look interesting. Instead, enable the agent-session feature flag, run local and remote subagent invocations, inspect how contextId and taskId persist, and deliberately break a remote stream to see how failure is represented. Then test MCP in non-interactive mode and compare the tool surface against an interactive session.

Also reproduce the traversal class yourself. Try custom-command patterns that point outside the workspace, including symlink paths, and verify the CLI refuses them. Behind a corporate proxy, confirm NO_PROXY behaves as expected. If you depend on A2A, check default policy loading rather than assuming the server starts with the policy you intended.
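"Behaves as expected" is worth pinning down before testing, because NO_PROXY matching differs between HTTP clients. The Python sketch below encodes one common interpretation so you can write down your expectations per host: comma-separated entries, an optional leading dot, suffix matching on domain boundaries, and * as a wildcard. It is an assumption for planning tests, not the CLI's dispatcher logic, and it deliberately ignores ports and CIDR ranges; bypasses_proxy is a hypothetical name.

```python
def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Return True if host should skip the proxy under the assumed rules."""
    host = host.lower().rstrip(".")
    for raw_entry in no_proxy.split(","):
        entry = raw_entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            return True
        # Treat ".example.com" and "example.com" the same way.
        entry = entry.lstrip(".")
        # Match the exact host or any subdomain, but never a bare suffix
        # such as "badexample.com" for the entry "example.com".
        if host == entry or host.endswith("." + entry):
            return True
    return False
```

Under these rules, api.internal.example.com bypasses the proxy for the value "localhost,.example.com", while badexample.com does not. If the CLI disagrees with your table of expectations, that is the finding worth reporting.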

The editorial take is straightforward: Google’s agent stack will not be judged by the elegance of the I/O slides. It will be judged by whether the CLI can preserve session state, invoke local and remote agents predictably, expose the same MCP surface in automation, and keep workspace boundaries intact when prompt syntax gets clever. This nightly moves in that direction. The keynote promised the platform; this is the commit history where the platform has to earn it.

Sources: GitHub — Gemini CLI v0.44.0-nightly.20260521.g57c42a5c4, Gemini CLI repository, PR #26665, PR #26937, PR #27234