Repo Forensics 2.9.0 Treats Agent Plugin Repos Like Supply-Chain Attack Surfaces, Not Helpful ZIP Files

Anatoliy Kolodkin

24 May 2026 • 4 min read

Agent plugin security is starting to look less like an AI problem and more like the JavaScript package ecosystem having a flashback. Repo Forensics 2.9.0 is a small open-source release by raw popularity numbers, but it lands on a real fault line: developers are wiring Claude skills, Codex plugins, OpenClaw extensions, MCP servers, and random GitHub repos directly into agent runtimes that can read code, call tools, touch credentials, and modify projects. That is not a harmless convenience layer. That is a supply-chain surface with a chatbot attached.

The new release adds “Scanner #20: Entrypoint Payload Injection Detection,” focused on malicious code that executes at require() or import time rather than through the obvious install hooks most teams know to watch. That distinction matters. A poisoned package does not need a loud postinstall script if the agent framework, MCP loader, or plugin registry imports it during startup. By then the package is already inside the runtime’s blast radius.

Import time is the new install script

Repo Forensics says the JavaScript side of the scanner looks for CommonJS injection patterns including appended IIFEs, high-entropy blocks, and suspicious module.exports reassignment. The Python side uses AST-based top-level-scope analysis for dangerous import-time behavior in files such as __init__.py and setup.py, explicitly calling out a durabletask-style pattern. The implementation detail is the point: this is not scanning for the string “malware.” It is looking for execution shape.
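To make "execution shape" concrete, here is a minimal sketch of top-level-scope analysis with Python's standard ast module. It is not Repo Forensics' implementation: the SUSPICIOUS_CALLS deny-list, the ImportTimeVisitor class, and the find_import_time_calls helper are all hypothetical names chosen for illustration. The idea is to walk the module but refuse to descend into function bodies, so only calls that would run at import time are reported.

```python
import ast

# Hypothetical deny-list for illustration only; a real scanner would use a
# much richer rule set than a handful of dotted names.
SUSPICIOUS_CALLS = {
    "exec", "eval", "os.system",
    "subprocess.run", "subprocess.Popen",
    "urllib.request.urlopen",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class ImportTimeVisitor(ast.NodeVisitor):
    """Collect suspicious calls that sit outside any function body."""

    def __init__(self):
        self.findings = []

    def _skip(self, node):
        # Function and lambda bodies run only when called, not at import,
        # so this sketch does not descend into them. (Simplification:
        # decorators and default arguments are skipped too, although they
        # do execute at import time.)
        pass

    visit_FunctionDef = _skip
    visit_AsyncFunctionDef = _skip
    visit_Lambda = _skip

    def visit_Call(self, node):
        name = dotted_name(node.func)
        if name in SUSPICIOUS_CALLS:
            self.findings.append((node.lineno, name))
        self.generic_visit(node)


def find_import_time_calls(source):
    """Return (line, call_name) pairs for calls executed at import time."""
    visitor = ImportTimeVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings
```

Class bodies are still visited because they execute when the module is imported, while the methods inside them are skipped. A production scanner would also need to handle aliased imports (import os as o), getattr tricks, and `if __name__ == "__main__":` guards, none of which this sketch attempts.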

That is exactly where agent extension ecosystems are vulnerable. Traditional dependency risk is already bad; agent dependency risk is worse because the dependency is often installed for the purpose of being called by automation. A normal library may sit behind application logic. An MCP server or agent plugin is intentionally placed on the tool path. If it runs code at import time, it can execute before the human sees the first tool approval, before the model plans the first step, and before the team’s nice governance document becomes relevant.

The release also adds Megalodon CI detection for base64 decode-and-execute patterns in GitHub Actions workflows, plus five IOC campaign groups: a May 2026 node-ipc credential stealer spanning three version-pinned packages, Shai-Hulud copycats, the Nx Console VS Code compromise, an @AntV worm propagation campaign described as affecting more than 320 packages and 59 million monthly downloads, and CanisterWorm ICP blockchain C2 from April 2026. Some of those names will age; the category will not. Package ecosystems keep teaching the same lesson because every new ecosystem believes it is too special to inherit the old failure modes.
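For readers who have not seen the decode-and-execute shape in a workflow file, the sketch below flags two common variants line by line: a base64 decode piped into a shell, and an eval wrapped around a decoding subshell. The regular expressions and the scan_workflow helper are assumptions made for illustration, not the rules Repo Forensics ships.

```python
import re

# Hypothetical patterns for illustration; real detection rules would need to
# cover far more variants (other decoders, variable indirection, and so on).
PATTERNS = [
    # ... | base64 -d | bash   (also sh, zsh, dash, optionally via sudo)
    re.compile(
        r"base64\s+(?:-d|-D|--decode)\b[^\n|]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b"
    ),
    # eval "$(... | base64 --decode)"
    re.compile(r"eval\s+[\"']?\$\([^)]*base64\s+(?:-d|-D|--decode)\b"),
]


def scan_workflow(text):
    """Return 1-based line numbers that match a decode-and-execute pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append(lineno)
    return hits
```

Because it works one line at a time, this sketch misses payloads split across YAML block scalars or shell line continuations, which is one reason a real scanner needs more than a pair of regexes.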

False positives are a product problem, not an implementation footnote

The more encouraging part of 2.9.0 is not just that it scans for entrypoint payloads. It ships the scanner with 32 tests and explicit false-positive resistance for webpack, rollup, and esbuild bundles. That matters because lazy static analysis is worse than no static analysis in one specific way: it trains developers to ignore or disable the tool.

Minified production JavaScript is ugly on purpose. Bundled code can contain dense functions, odd wrappers, and generated syntax that resembles obfuscation to a simplistic scanner. If an agent-security tool flags every bundle as malware, teams will either bypass it or quarantine it into a “security theater” lane nobody reads. A useful scanner has to distinguish “compiled by a bundler” from “crafted to steal tokens.” Repo Forensics is at least aiming at that boundary rather than pretending confidence is free.

The release includes other hardening details that read boring until you imagine an attacker reading the scanner source first: capping IIFE scan regions to 10KB, handling newline-before-brace regex evasion, expanding base64 redirect-then-exec detection, broadening suspicious argument detection, and catching MemoryError in ast.parse. Security tools are also software. They have parser edges, denial-of-service risks, evasion patterns, and user-experience failure modes. If a scanner cannot survive hostile input, it is not a scanner; it is a demo.
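The ast.parse point generalizes: a scanner should bound its input and treat parser failures as results rather than crashes. The following is a minimal sketch under stated assumptions. MAX_SOURCE_BYTES and safe_parse are hypothetical names, and the 1 MB cap is an arbitrary illustrative value, not the limit Repo Forensics uses.

```python
import ast

# Hypothetical size cap, chosen only for illustration.
MAX_SOURCE_BYTES = 1_000_000


def safe_parse(source):
    """Parse untrusted Python source without letting the parser take the
    scanner down. Returns (tree, None) on success or (None, reason)."""
    if len(source.encode("utf-8", errors="replace")) > MAX_SOURCE_BYTES:
        return None, "too-large"
    try:
        return ast.parse(source), None
    except SyntaxError:
        return None, "syntax-error"
    except (MemoryError, RecursionError, ValueError):
        # Depending on the Python version, hostile input such as extreme
        # nesting or embedded null bytes can surface as any of these.
        return None, "parser-limit"
```

The design choice that matters is what the caller does with a failure: a file that cannot be parsed should be reported as suspicious or unscanned, never silently counted as clean.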

The agent extension store is already here, even if nobody admits it

The project README positions Repo Forensics as “npm audit for AI-agent plugins, skills, and MCP servers,” local-only, zero-dependency, zero-telemetry, and offline. It claims 20 scanners, more than 800 patterns, 1,306 tests, 41 correlation rules, and more than 140 package IOCs. The repository is still modest, with 76 stars and 13 forks at research time, so this is not a market-victory story. It is a category story.

The category is obvious if you look at how developers actually work now. They install an MCP server because a README says it unlocks a service. They add a skill because a coding agent promises better reviews. They clone a plugin because a Discord thread says it fixes a workflow. Then that code gets connected to local repos, API keys, shell commands, browser sessions, and increasingly permissive approval habits. Browser extensions trained users to click “allow.” Agent extensions are training developers to click “continue.” That is not progress.

Repo Forensics also ships Codex and OpenClaw plugin manifests and supports auto-scan hooks for Claude Code and Codex CLI, with OpenClaw setup through PreToolUse, PostToolUse, and SessionStart checks. That hook model is more important than the branding. The scan needs to happen before npm install, after git clone, after git pull, and when a session starts with changed tools. A quarterly security review will not catch the malicious MCP server a developer installed at 11:47 p.m. because it solved a calendar integration problem.

For practitioners, the action list is concrete. Treat agent-facing code as privileged automation, not as a normal dev dependency. Version-pin MCP servers, plugins, and skills. Scan before installation and again after updates. Block GitHub Actions that decode and execute opaque payloads unless a human can justify them. Maintain an internal allowlist for extensions that touch source code, credentials, shells, browsers, or deployment systems. Document what each extension can access, not just what it claims to do.
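The pinning and allowlist items can be combined into one gate that runs before an extension is loaded. The sketch below is one possible shape, not a feature of Repo Forensics: PinnedExtension, verify_extension, and the allowlist layout are all hypothetical. Each entry records the pinned version, the SHA-256 of the reviewed artifact, and the access the extension was approved for.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedExtension:
    """One allowlist entry (hypothetical schema for illustration)."""
    name: str
    version: str
    sha256: str     # hex digest of the reviewed artifact
    access: tuple   # documented capabilities, e.g. ("shell", "credentials")


def verify_extension(name, version, artifact, allowlist):
    """Check an artifact (bytes) against the allowlist before loading it.

    Returns "ok" or a short reason string describing the first failure.
    """
    entry = allowlist.get(name)
    if entry is None:
        return "not-allowlisted"
    if entry.version != version:
        return "version-mismatch"
    if hashlib.sha256(artifact).hexdigest() != entry.sha256:
        return "hash-mismatch"
    return "ok"
```

Checking the digest as well as the version matters because a version string alone does not prove the bytes are the ones that were reviewed.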

The other practical move is cultural: stop reviewing prompt injection in isolation. Prompt injection is real, but it is only one way agent systems fail. A malicious plugin does not need to trick the model if it can run at import time. An MCP server does not need to jailbreak the assistant if it can exfiltrate through its own implementation. A skill bundle does not need code execution if it can quietly rewrite the agent’s behavioral priorities. Supply-chain review and prompt-safety review belong in the same policy, because attackers will not respect your org chart.

My take: Repo Forensics 2.9.0 is not important because this one tool has already won. It is important because the agent ecosystem is finally discovering that “extension marketplace” is a security boundary. AI coding agents do not need more vibes. They need dependency hygiene that JavaScript learned the painful way, preferably before the worm writes the postmortem.

Sources: Repo Forensics v2.9.0 release, Repo Forensics repository, Aguara v0.18.3 release
