Claude Code’s Local Skills Turn Agent Plugins Into Runtime Policy
Claude Code’s latest release is not the kind that wins a keynote slide. That is precisely why it matters.
Version 2.1.157, published May 29, 2026, lands in the shadow of Anthropic’s bigger Opus 4.8 and dynamic-workflows push. The headline release promised large tasks, background subagents, and codebase-scale orchestration. This follow-up is the operational layer underneath that pitch: local skills that auto-load from .claude/skills, plugin scaffolding with claude plugin init, agent selection honored from settings.json, worktree switching during a session, and more detailed OpenTelemetry events for tool decisions.
That sounds like plumbing. In agentic coding, plumbing is the product.
The release notes say plugins in .claude/skills directories now load automatically without requiring a marketplace. Developers can scaffold a local plugin directly into that path with claude plugin init <name>. The /plugin command also gets autocomplete for subcommands, installed plugin names, and known marketplace plugins. Claude-dispatched agents now honor the agent field in settings.json, while --agent <name> remains the override. Claude-managed worktrees can be switched mid-session with EnterWorktree, and managed worktrees are left unlocked after agent completion so normal git worktree remove and git worktree prune cleanup can work.
The most important line is the least glamorous: tool_decision telemetry events can carry tool_parameters, such as bash commands and MCP or skill names, when OTEL_LOG_TOOL_DETAILS=1 is set.
Local skills are policy, not decoration
The obvious read is that Claude Code just made plugins easier. The more useful read is that Claude Code is turning project-local skills into a runtime policy surface.
A skill is not merely a nicer README. It can shape how the agent interprets the project, which workflows it reaches for, which tools it calls, and what it treats as normal. Once skills auto-load from a repository path, the repo is no longer just code plus docs. It becomes code, docs, and agent behavior. That is powerful for teams that want consistent agent behavior across a codebase. It is also a new supply-chain surface hiding in plain sight.
Teams already review CI workflows because YAML can deploy production, leak secrets, or delete data with a straight face. Agent skills deserve similar treatment. They are friendlier to read, but the blast radius can be comparable if they steer an agent toward sensitive files, broad shell access, credential-bearing MCP servers, or destructive maintenance commands.
The right response is not to ban local skills. It is to treat them like code. Keep project skills in version control. Review changes. Separate personal experiments from shared repo behavior. Prefer narrow instructions over broad “be autonomous” vibes. If a skill can influence shell commands, MCP access, or deployment-adjacent work, it belongs in the same review path as build scripts and infrastructure automation.
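One way to put skills in the same review path as build scripts is a small pre-merge check that flags skill files mentioning risky capabilities. The sketch below is an illustration, not a Claude Code feature: the pattern list, the function names, and the choice to scan every file under .claude/skills as plain text are all assumptions made for the example.

```python
import re
from pathlib import Path

# Hypothetical patterns a team might want a human to look at; tune per repo.
RISKY_PATTERNS = {
    "destructive shell": re.compile(r"\brm\s+-rf\b|\bgit\s+push\s+--force\b"),
    "credential access": re.compile(r"\.env\b|id_rsa|AWS_SECRET_ACCESS_KEY", re.IGNORECASE),
    "pipe to shell": re.compile(r"curl[^\n|]*\|\s*(ba)?sh\b"),
}


def flag_risky_lines(text):
    """Return (line_number, label, line) for each line matching a risky pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings


def scan_skills(root=".claude/skills"):
    """Scan every readable text file under the skills directory (assumed layout)."""
    results = {}
    base = Path(root)
    if not base.is_dir():
        return results
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        findings = flag_risky_lines(text)
        if findings:
            results[str(path)] = findings
    return results


if __name__ == "__main__":
    for file_name, findings in scan_skills().items():
        for number, label, line in findings:
            print(f"{file_name}:{number}: {label}: {line}")
```

A check like this does not decide anything on its own. Its job is to make sure a human reviewer sees the lines most likely to widen an agent's reach.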
The audit log finally gets closer to the thing auditors ask for
The OpenTelemetry change is the enterprise signal. An audit stream that says “the agent made a tool decision” is mostly theater. An audit stream that can say “the agent requested this bash command” or “this MCP server and skill were involved” is something a platform team can actually investigate.
There is a catch: OTEL_LOG_TOOL_DETAILS=1 may put sensitive operational details into logs. Bash parameters can include file paths, branch names, ticket IDs, internal hostnames, or accidentally copied secrets. MCP and skill names can expose internal systems or workflow design. This is not telemetry to casually dump into a long-retention, broadly readable logging bucket.
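Teams that want the detail without the exposure may choose to scrub events before they reach shared storage. The following is a minimal sketch of that idea, assuming a tool_decision event arrives as a dictionary with a tool_parameters field. The release notes name that field, but the event layout, the regexes, and the function names here are assumptions for illustration only.

```python
import re

# Hypothetical secret shapes for illustration; a real deployment would rely on
# a maintained secret scanner rather than two regexes.
REDACTIONS = [
    # key=value pairs with a sensitive-looking key; the key name is kept.
    (re.compile(r"\b(token|password|secret|api[_-]?key)=\S+", re.IGNORECASE),
     r"\1=[REDACTED]"),
    # Strings shaped like an AWS access key ID.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
]


def redact(value):
    """Recursively redact secret-looking substrings in strings, dicts, and lists."""
    if isinstance(value, str):
        for pattern, replacement in REDACTIONS:
            value = pattern.sub(replacement, value)
        return value
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def scrub_event(event):
    """Return a copy of the event with tool_parameters redacted (assumed layout)."""
    cleaned = dict(event)
    if "tool_parameters" in cleaned:
        cleaned["tool_parameters"] = redact(cleaned["tool_parameters"])
    return cleaned
```

Pattern-based redaction reduces accidental leakage but will miss secrets it has no pattern for, so it complements access controls on the log store rather than replacing them.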
But for teams serious about agentic coding, the capability is necessary. You cannot govern what you cannot reconstruct. If a background agent edits the wrong file, calls the wrong MCP server, or asks for a suspicious command, the review trail needs enough context to distinguish a model mistake from a prompt-injection attempt, a bad skill, a misconfigured agent profile, or an operator approval error.
That is the bar coding agents are moving toward: not “did the model seem helpful?” but “can we replay what the runtime allowed, denied, and executed?” Claude Code is not all the way there, but including tool parameters in tool_decision events is a real step toward useful runtime observability.
Background agents are process supervisors now
The background-agent fixes in this release read like a checklist from a system that has stopped being a chat UI and started being a process supervisor. Completed sessions sometimes did not retire when an idle subagent was parked or had leaked a background shell. Background worktrees under .claude/worktrees/ could be orphaned after the 30-day job retention sweep. --resume did not report subagents that were running when the previous process exited. Reattached background sessions after sleep and wake could tell the model the wrong date.
None of those bugs are sexy. All of them matter.
Dynamic workflows and background agents create parallel branches of work. Worktrees are the filesystem boundary that keeps those branches from trampling one another. If worktrees leak, session state gets stale, or a resumed process forgets which subagents are still alive, the agent stops being a collaborator and becomes a distributed systems incident with a friendly prompt.
The change allowing Claude-managed worktrees to be switched mid-session is useful, but it also raises the operational bar. The runtime has to keep branch identity, working directory, cleanup semantics, and user intent aligned. Engineers should treat those details as part of agent correctness. A coding agent that writes the right patch in the wrong worktree has still failed.
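A worktree check does not need to be elaborate. The sketch below parses the output of git worktree list --porcelain and reports worktrees under .claude/worktrees/ that git marks as locked or prunable. The function names and the decision to treat those two states as worth flagging are assumptions for this example, not behavior Claude Code ships.

```python
import subprocess


def parse_worktrees(porcelain):
    """Parse `git worktree list --porcelain` output into a list of dicts."""
    entries, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():
            # A blank line ends the current worktree record.
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        # Bare attributes such as "detached" or "locked" carry no value.
        current[key] = value if value else True
    if current:
        entries.append(current)
    return entries


def managed_worktree_issues(entries, marker="/.claude/worktrees/"):
    """Flag locked or prunable worktrees whose path contains the marker."""
    issues = []
    for entry in entries:
        path = entry.get("worktree", "")
        if marker not in path:
            continue
        if "locked" in entry:
            issues.append((path, "locked"))
        if "prunable" in entry:
            issues.append((path, "prunable"))
    return issues


if __name__ == "__main__":
    output = subprocess.run(
        ["git", "worktree", "list", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    for path, problem in managed_worktree_issues(parse_worktrees(output)):
        print(f"{problem}: {path}")
```

Run after agents have finished, a non-empty report suggests something for a human to inspect before reaching for git worktree remove or git worktree prune.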
The release also includes smaller safety and usability repairs: zero-byte or corrupt image attachments become text placeholders instead of crashing the request; sandbox network prompts no longer appear incorrectly in auto or bypass mode; Stop in the IDE now stops a background subagent; and a config setting prevents the word “workflow” from accidentally triggering dynamic workflow behavior. That last one is almost funny until it costs money. In an agent runtime, accidental invocation is a production bug.
Auto mode in the cloud makes model choice a governance problem
The May 30 v2.1.158 follow-up extends Auto mode to Bedrock, Vertex, and Foundry for Opus 4.7 and 4.8 behind CLAUDE_CODE_ENABLE_AUTO_MODE=1. That matters because many enterprise teams route models through cloud provider surfaces for procurement, logging, residency, policy enforcement, or quota control.
Auto mode can be useful if it picks the right model for a job without forcing developers to micromanage every session. It can also blur the exact decision teams need to govern: which model handled which work, under what cost assumptions, with what data boundary. Model routing is no longer just a UX preference. It is part of the runtime contract.
Before enabling Auto mode broadly, teams should test whether it preserves their expectations around cost, latency, provider routing, and data handling. A background workflow that silently escalates model choice may be acceptable for a high-value migration. It is probably not acceptable for every lint cleanup, docs edit, or speculative refactor.
The practical upgrade path is conservative. Enable detailed tool telemetry only where logs are protected and access-controlled. Move shared skills into reviewed project paths, and keep personal skills out of team repositories. Define the allowed agent profiles in settings.json and document when --agent overrides are appropriate. Add worktree cleanup checks to your agent runbooks. If your team uses dynamic workflows, disable accidental keyword triggers and require explicit invocation for expensive orchestration.
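The agent-profile part of that checklist can be enforced with a few lines in CI. This sketch assumes the agent field is a top-level string in settings.json, which is the most the release notes imply. The allowlist names and the function are hypothetical.

```python
import json

# Hypothetical profile names; replace with the agents your team has reviewed.
ALLOWED_AGENTS = {"reviewer", "migration-runner"}


def check_agent_setting(settings_text, allowed=ALLOWED_AGENTS):
    """Return (ok, message) for the `agent` field of a settings.json document."""
    settings = json.loads(settings_text)
    agent = settings.get("agent")
    if agent is None:
        return True, "no agent configured"
    if not isinstance(agent, str):
        return False, "agent field is not a string"
    if agent in allowed:
        return True, f"agent '{agent}' is on the allowlist"
    return False, f"agent '{agent}' is not on the allowlist"
```

A check like this only covers what is committed. Overrides passed with --agent at the command line still need the documented policy the paragraph above calls for.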
Claude Code’s local skills and tool telemetry do not make agents magically safe. They make agent behavior more explicit, more local to the project, and more observable when configured correctly. That is the right direction. The next phase of agentic coding will not be won by the tool with the flashiest demo. It will be won by the runtime that lets teams answer boring questions under pressure: what loaded, what ran, who approved it, which worktree changed, what did it cost, and how do we cleanly unwind it?
LGTM, with a review comment: local agent policy is now part of the codebase. Treat it that way before it treats production like a scratchpad.
Sources: GitHub — Claude Code v2.1.157, GitHub — Claude Code v2.1.158, Claude Code v2.1.154 release notes, Anthropic — Introducing Claude Opus 4.8