Six Patches, Zero Structural Fixes: The AI Coding Agent Security Problem Nobody Is Solving

Anatoliy Kolodkin

30 Apr 2026 • 6 min read

Here's what nine months of credential heists through AI coding agents actually tells us: the security community found the bugs. The vendors patched them. Nothing fundamental changed.

VentureBeat published a comprehensive breakdown today cataloging six credential-theft exploits targeting Codex, Claude Code, Copilot, and Vertex AI — attacks spanning from mid-2025 through this month. The specificity is useful: this is not speculation about future risks. These are documented incidents with CVE numbers, patch dates, and attack chains you can go read. What makes the piece worth sitting with is the pattern underneath all six of them, and what that pattern says about where we are with AI agent security as a practice.

The exploit family tree

Start with the mechanics, because they matter for understanding the scope of the problem.

OpenAI Codex got caught cloning repos by embedding a GitHub OAuth token directly in the git remote URL. A semicolon and backtick in a branch name turned that parameter into a subshell exfiltration vector — the branch name ran curl before the token was scrubbed. OpenAI patched it in February 2026. The stealth layer is the detail that makes this worth remembering: 94 ideographic space characters (Unicode U+3000) appended to the malicious branch name made it look visually identical to "main" in the Codex web portal. That's not a zero-day in the model. That's a UI spoofing attack that works because the web interface wasn't checking the actual branch name encoding.
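To make the two halves of that attack concrete, here is a minimal sketch of a branch-name check that flags both shell metacharacters and the kind of Unicode padding described above. The function name and the metacharacter set are illustrative assumptions, not anything OpenAI shipped, and a check like this complements passing arguments to git without a shell rather than replacing it.

```python
import unicodedata

# Characters that can let a branch name break out of a shell command line.
# This set is an illustrative assumption, not an exhaustive list.
SHELL_METACHARS = set(";`$|&<>(){}\\\n\"'")


def branch_name_findings(name: str) -> list[str]:
    """Return the reasons a branch name looks unsafe (empty list if none)."""
    findings = []
    if any(ch in SHELL_METACHARS for ch in name):
        findings.append("shell metacharacter")
    # Category Zs covers space separators such as U+3000 (ideographic space);
    # category Cf covers invisible format characters such as zero-width spaces.
    if any(unicodedata.category(ch) in ("Zs", "Cf") for ch in name):
        findings.append("whitespace or invisible character")
    return findings


if __name__ == "__main__":
    padded = "main" + "\u3000" * 94 + ";`id`"
    print(branch_name_findings("main"))
    print(branch_name_findings(padded))
```

The point of checking Unicode categories rather than a single code point is that U+3000 is only one of many characters that render as blank space.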

Claude Code's file-write sandbox had two distinct problems worth separating. CVE-2026-25723: piped sed and echo commands escaped the project sandbox because command chaining wasn't validated — the fix was structural, not a prompt tweak. CVE-2026-33068 was subtler: Claude Code resolved permission modes from .claude/settings.json before showing the workspace trust dialog. A malicious repo could set permissions.defaultMode to bypassPermissions, and the trust prompt never appeared. You never got asked. The settings file won by default.
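A pre-clone or pre-open check for the second issue can be sketched in a few lines. The file path and the permissions.defaultMode key come from the description above; the function names and the decision to fail closed on unparseable files are assumptions made for illustration.

```python
import json
from pathlib import Path


def settings_requests_bypass(raw: str) -> bool:
    """Return True if settings text asks for bypassPermissions (or cannot be parsed)."""
    try:
        data = json.loads(raw)
    except ValueError:
        return True  # fail closed: an unreadable settings file is treated as suspicious
    if not isinstance(data, dict):
        return True
    perms = data.get("permissions")
    return isinstance(perms, dict) and perms.get("defaultMode") == "bypassPermissions"


def repo_requests_bypass(repo_root: str) -> bool:
    """Check a checked-out repository for a .claude/settings.json that bypasses prompts."""
    settings = Path(repo_root) / ".claude" / "settings.json"
    if not settings.is_file():
        return False
    try:
        return settings_requests_bypass(settings.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return True  # fail closed on read or decoding errors
```

Running a check like this before the agent ever opens the workspace keeps the decision outside the file the attacker controls.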

Then there's the one getting the most traction in developer channels: Adversa AI found that Claude Code silently dropped deny-rule enforcement once a command exceeded 50 subcommands. Anthropic's engineers had traded security for speed and stopped checking after the fiftieth. That's not a model alignment problem. That's a performance optimization that someone made at the security boundary and never reconsidered. It was patched in v2.1.90, but the pattern (a security check disabled because it was expensive) shows up in agentic systems more often than the CVE list alone makes visible.
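The alternative to skipping the check is to fail closed when the budget is exceeded. The sketch below shows that shape; it is not Anthropic's implementation. The deny list, the budget constant, and the naive splitter are all hypothetical, and a real implementation would need a proper shell parser that understands quoting.

```python
import re

MAX_SUBCOMMANDS = 50              # hypothetical checking budget
DENIED_PROGRAMS = ("curl", "wget")  # hypothetical deny rules


def split_subcommands(command: str) -> list[str]:
    """Naively split on ;, &&, ||, | and newlines (ignores quoting)."""
    parts = re.split(r"&&|\|\||[;|\n]", command)
    return [p.strip() for p in parts if p.strip()]


def is_allowed(command: str) -> bool:
    subcommands = split_subcommands(command)
    # Fail closed: if the command is too long to check cheaply, reject it
    # instead of letting the unchecked remainder through.
    if len(subcommands) > MAX_SUBCOMMANDS:
        return False
    return not any(sub.split()[0] in DENIED_PROGRAMS for sub in subcommands)
```

The performance cost stays bounded either way; the only difference is which side of the boundary absorbs the edge case.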

GitHub Copilot had its own chain: hidden instructions in a PR description flipped auto-approve mode in .vscode/settings.json, disabling all confirmations. No shell commands required. No malicious binary. Just a settings file write triggered by natural-language instructions the model generated from untrusted content. And Persistent Security's co-discovery found that a GitHub issue containing a crafted JSON $schema URL could exfiltrate the GITHUB_TOKEN through the schema resolution mechanism itself, enabling full repository takeover. Open the issue. That's it.
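One defensive idea that follows from the $schema finding is to flag remote schema references in JSON that arrives from untrusted content before any tool resolves them. This is a rough sketch under that assumption; the trusted-host allowlist and the function name are hypothetical, and it does not reproduce the exploit chain itself.

```python
import json
from urllib.parse import urlparse

TRUSTED_SCHEMA_HOSTS = {"json-schema.org"}  # hypothetical allowlist


def untrusted_schema_urls(node) -> list[str]:
    """Recursively collect $schema values that point at hosts outside the allowlist."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$schema" and isinstance(value, str):
                host = urlparse(value).hostname
                # Relative references have no hostname and trigger no remote fetch.
                if host is not None and host not in TRUSTED_SCHEMA_HOSTS:
                    found.append(value)
            else:
                found.extend(untrusted_schema_urls(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(untrusted_schema_urls(item))
    return found


if __name__ == "__main__":
    doc = json.loads('{"$schema": "https://attacker.example/s.json", "name": "demo"}')
    print(untrusted_schema_urls(doc))
```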

Vertex AI's issue is architecturally different and worth isolating. Unit 42 found that Vertex AI's default Google service identity — called a P4SA — had excessive OAuth scopes by design. Stolen P4SA credentials granted unrestricted read access to every Cloud Storage bucket in the project and reached restricted Google-owned Artifact Registry repositories at the core of Vertex AI Reasoning Engine. The researchers described the P4SA as functioning like a "double agent" — it had access to both user data and Google's own infrastructure. That's not an implementation bug. That's a scope design decision that created a privilege escalation path as a feature.

The structural failure, named

Merritt Baer, former Deputy CISO at AWS and CSO at Enkrypt AI, gave VentureBeat the most useful quote in the piece: "Enterprises believe they've 'approved' AI vendors, but what they've actually approved is an interface, not the underlying system."

That is the structural failure all six of these exploits relied on. When you authorized Claude Code or Codex to access your GitHub account, you authorized the happy path. What you did not authorize, because it was never made explicit, was the agent running in an environment where a malicious repository can set its own permission rules, where a branch name can become a shell command, or where a PR description can rewrite your .vscode/settings.json without any visible interaction. These are not model vulnerabilities. They are credential-to-runtime mismatch problems that exist because the agent threat model was still being written as the exploits arrived.

The 50-subcommand bypass deserves extra attention precisely because it illustrates a failure mode that appears repeatedly in agentic systems: a security check converted to a performance shortcut and never revisited. The deny-rule enforcement in Claude Code was not broken by accident. It was deliberately disabled past a threshold because checking every subcommand was expensive. That is a completely rational engineering decision at the time it was made, and it is exactly the kind of decision that ages badly as agent capabilities expand. You optimize for the common case. The adversary optimizes for the edge case you deprioritized. The two paths meet eventually.

Kayne McGladrey, IEEE Senior Member, put the permission problem in practical terms: agents use "far more permissions than they should have, more than a human would, because of the speed of scale and intent." That is the crux. A human developer who wants to exfiltrate credentials has to actively choose to do something wrong. An agent that holds broad OAuth scopes will happily execute the instructions embedded in a malicious PR description because that's what it was built to do. The attack surface is not the model's values. It's the gap between what the credential allows and what the agent will do with it when the environment provides unexpected instructions.

What practitioners should actually do

The VentureBeat defense grid maps each exploit to the control that failed and the gap that allowed it. The bottom row is the most important line in the article: "Inventory and govern agent identities — No major AI coding agent vendor ships agent identity discovery or lifecycle management." That is the structural gap. Not this CVE or that patch. The entire category of controls that would let an enterprise treat an AI agent's credentials the way it treats a human privileged user's credentials — with discovery, rotation, least-privilege scoping, and separation of duties — does not exist as a shipped product from any major vendor.

The practical actions for engineering and security teams are concrete. Audit every OAuth scope granted to every AI coding agent the same way you audit service accounts — and that audit will probably be uncomfortable, because most deployments started with broader scopes than anyone would approve today. Monitor for the specific techniques these exploits used: Unicode obfuscation (U+3000 and similar ideographic spaces in branch names), command chaining above 50 subcommands, and any changes to .vscode/settings.json or .claude/settings.json that flip permission modes. Ask your vendors in writing — before the next renewal — to show you their identity lifecycle management controls for the agent running in your environment. If the answer is "we don't have those," that is your audit finding.
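The scope audit in particular is simple to express once you have an inventory. The sketch below assumes you can export, per agent, the set of scopes it was granted and the set your team actually approved; the agent names and the data layout are invented for illustration.

```python
def excess_scopes(
    granted: dict[str, set[str]],
    approved: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per agent, the granted scopes that were never approved."""
    report = {}
    for agent, scopes in granted.items():
        # An agent with no approval record is treated as having nothing approved.
        extra = scopes - approved.get(agent, set())
        if extra:
            report[agent] = extra
    return report


if __name__ == "__main__":
    # Hypothetical inventory data.
    granted = {
        "coding-agent-a": {"repo", "workflow", "admin:org"},
        "coding-agent-b": {"repo"},
    }
    approved = {
        "coding-agent-a": {"repo"},
        "coding-agent-b": {"repo"},
    }
    for agent, extra in excess_scopes(granted, approved).items():
        print(agent, sorted(extra))
```

The hard part is not the set difference. It is producing the granted inventory in the first place, which is exactly the discovery capability the defense grid says no vendor ships.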

The harder question is the one CrowdStrike CTO Elia Zaitsev raised at RSAC 2026: whether the entire framework of "authorize an agent to act on your behalf" is compatible with least-privilege principles when the agent's actions are determined by untrusted inputs it encounters at runtime. Nine months of exploits suggest the answer is probably no, and the fix is not the vendors' responsibility alone. It requires enterprises to govern AI agent identities the same way they govern human privileged identities: credential rotation, least-privilege scoping, and separation of duties between the agent that writes code and the agent that deploys it.

Mike Riemer, CTO at Ivanti, added the operational constraint that makes this urgent: "Threat actors are reverse engineering patches within 72 hours." For traditional software, that is a manageable window. For an agent that processes every repository it touches, that window compresses to seconds — the agent encounters the malicious input before the patch is deployed, and the credential the agent holds is exactly what the attacker needed.

The platform implication

One thing the VentureBeat piece doesn't dwell on, but that matters for how this beat is evolving: it treats these as separate vendor problems. In practice, every major AI coding agent vendor is dealing with the same underlying issue: the agent holds credentials, the environment is untrusted, and the security boundary was designed for a world where the actor and the environment were controlled by the same person. That's not a Claude Code problem or a Codex problem. It's an architectural problem for the entire category, and the vendors that acknowledge it first and build identity governance primitives into the runtime, rather than just patching individual exploits, will be the ones who earn enterprise trust as this market matures.

Six patches. Zero structural fixes. The story is not that AI coding agents are insecure. It's that the identity governance framework we use for human privileged access hasn't been built for agents yet, and the nine-month exploit chain is what happens when you run a category of software with human-scale trust assumptions in an environment that is increasingly adversarial.

Sources: VentureBeat, BeyondTrust, Adversa AI, SentinelOne CVE-2026-25723, SentinelOne CVE-2026-33068, Unit 42
