
Copilot CLI 1.0.55-3 Turns Plugins, Hooks, and Skills Into Session Controls

Anatoliy Kolodkin

27 May 2026 • 5 min read

GitHub Copilot CLI 1.0.55-3 is a pre-release, which is usually code for “skip unless you enjoy changelog archaeology.” This one is worth reading because it shows where the terminal-agent fight is going: plugins, hooks, skills, remotes, policy, and token accounting are becoming session controls rather than background configuration.

The release adds hook progress streaming, per-session pluginDirectories in session.create and session.resume, direct deletion of remote sessions from the picker, marketplace plugin installs using owner/repo#ref, tmux progress integration, deterministic skill precedence, remote-control policy messaging, and reasoning-token summaries. That is a lot of plumbing. It is also the shape of enterprise agent adoption after the novelty phase ends.

Copilot already has distribution. The harder work now is making agent sessions explicit enough that teams can reason about them. Which plugins were loaded? Which skills won when names collided? Which hook is still running? Did organization policy block remote control, or did the user misconfigure something locally? How many reasoning tokens did this task burn? Those questions are not cosmetic. They are the difference between a coding assistant and a governed automation surface.

Per-session plugins are the right primitive

The most important change is pluginDirectories on session creation and resume. Global plugins are convenient, but they are a governance smell. A plugin appropriate for one repository, client, team, or experiment should not silently become part of every Copilot session on the machine. Per-session plugin directories let SDK clients shape the capability set for the work being done right now.

That matters because plugins are not themes. They can expose tools, change workflows, influence prompts, and alter what the agent can see or do. Treating them as ambient machine state makes sessions hard to reproduce. A developer says “Copilot fixed it yesterday,” but nobody can tell which plugin version, branch, or local directory shaped the behavior. Mounting plugin directories per session turns capability selection into part of the session contract.

The new marketplace syntax, owner/repo#ref, points in the same direction. Pinning a plugin to a ref is basic supply-chain hygiene. Without it, “install this plugin” can mean different code tomorrow. With it, teams can review, approve, and reproduce a specific version. Agent plugin ecosystems will need this discipline quickly, because the incentive structure is obvious: every tool vendor wants to become agent-addressable, and every clever developer will be tempted to install the thing that makes today’s task easier.
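One way to enforce pinning is to check plugin specs before a team workflow installs them. The sketch below is not the CLI's own parser: the release names the owner/repo#ref shape, but the exact characters accepted, the PluginSpec type, and the function names are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional


class PluginSpec(NamedTuple):
    owner: str
    repo: str
    ref: Optional[str]  # None when the spec is not pinned to a ref


# Assumption: owner and repo use letters, digits, "_", "." and "-".
# The real CLI may accept a different character set.
_SPEC = re.compile(r"^(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+)(?:#(?P<ref>\S+))?$")


def parse_plugin_spec(spec: str) -> PluginSpec:
    """Split 'owner/repo#ref' into its parts; ref is None if absent."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"not an owner/repo[#ref] spec: {spec!r}")
    return PluginSpec(match["owner"], match["repo"], match["ref"])


def unpinned(specs):
    """Return the specs that would float to whatever the default ref is."""
    return [s for s in specs if parse_plugin_spec(s).ref is None]
```

A CI step could fail the build when unpinned(...) returns anything, which turns "pin your plugins" from advice into a check.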

The practical policy should be familiar to anyone who has managed dependencies. Pin plugin refs for team workflows. Prefer project-local or explicitly mounted plugins over personal machine state. Record plugin directories in automation logs. Review plugins that can execute commands, reach networks, read secrets, or modify code. If that sounds like too much ceremony, remember that the plugin is being loaded into a tool whose job is to change your repository.
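Recording plugin directories can be as simple as writing one structured log line per session. This is a minimal sketch under stated assumptions: only the pluginDirectories name comes from the release, while the task and recordedAt fields and both function names are hypothetical, and nothing here calls the Copilot SDK.

```python
import json
from datetime import datetime, timezone


def session_manifest(task, plugin_dirs, now=None):
    """Build a small record of what shaped a session.

    Field names other than pluginDirectories are illustrative.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "task": task,
        "recordedAt": now.isoformat(),
        # Sorted so that two sessions with the same mounts log identically.
        "pluginDirectories": sorted(str(d) for d in plugin_dirs),
    }


def to_log_line(manifest):
    """Serialize with stable key order so log lines diff cleanly."""
    return json.dumps(manifest, sort_keys=True)
```

When someone later says the agent fixed it yesterday, a line like this is what lets a reviewer answer which plugin directories were mounted at the time.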

Skill precedence is where hidden behavior becomes visible

Copilot CLI also changes skill precedence so --plugin-dir skills outrank personal-home skills with the same name. The order is now project > plugin-dir > personal > custom. This sounds like housekeeping until you imagine two skills named release, one reviewed by the project and one living in a developer’s home directory from six months ago.

Precedence rules are policy rules. The runtime has to choose, and the choice should be predictable. Favoring project skills first is correct for collaborative codebases because the repository should define its own workflow. Favoring explicitly mounted plugin-dir skills over personal skills is also sensible because the session caller intentionally attached that capability. Personal defaults should not silently override the project’s safety rails.
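The precedence rule is easy to model, which is part of why it is worth making explicit. The sketch below assumes only the order stated above (project > plugin-dir > personal > custom). The Skill type, its fields, and resolve_skills are hypothetical and do not reflect how the CLI stores or loads skills.

```python
from dataclasses import dataclass

# Order as described in the release: earlier layers win.
PRECEDENCE = ("project", "plugin-dir", "personal", "custom")


@dataclass(frozen=True)
class Skill:
    name: str
    layer: str  # one of PRECEDENCE
    path: str   # where the skill was found (illustrative)


def resolve_skills(skills):
    """Return {name: winning Skill}, choosing the highest-precedence layer."""
    rank = {layer: i for i, layer in enumerate(PRECEDENCE)}
    winners = {}
    for skill in skills:
        if skill.layer not in rank:
            raise ValueError(f"unknown layer: {skill.layer!r}")
        current = winners.get(skill.name)
        # A lower rank number means higher precedence.
        if current is None or rank[skill.layer] < rank[current.layer]:
            winners[skill.name] = skill
    return winners
```

With two skills named release, one in the project and one in a home directory, the project copy wins regardless of the order in which they were discovered.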

This is an under-discussed part of the “agent skills” boom. Skills are often described as reusable instructions or workflows, but in practice they are supply-chain artifacts. They can encode assumptions, approval shortcuts, command patterns, test strategy, deployment steps, or project-specific knowledge. Duplicate names across layers are inevitable. If the runtime cannot explain which one won, reviewers cannot reason about the agent’s behavior.

Teams should respond by treating skills like code. Keep project skills reviewed and versioned. Avoid duplicate names unless the override is intentional. Document precedence in the repo. Consider linting skill names across project, plugin, and personal layers in serious environments. The goal is not to kill customization. It is to keep customization from becoming invisible authority.
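A name-collision lint does not need much machinery. The following sketch assumes the skill names have already been collected per layer by some other step. The input shape, the find_collisions name, and the intentional allowlist are all illustrative choices, not part of Copilot CLI.

```python
from collections import defaultdict


def find_collisions(layers, intentional=frozenset()):
    """Report skill names that appear in more than one layer.

    layers maps a layer name to the skill names found there.
    Names listed in intentional are treated as deliberate overrides.
    """
    seen = defaultdict(list)
    for layer, names in layers.items():
        # dict.fromkeys drops repeats within a layer but keeps order.
        for name in dict.fromkeys(names):
            seen[name].append(layer)
    return {
        name: found
        for name, found in seen.items()
        if len(found) > 1 and name not in intentional
    }
```

The allowlist is the important design choice: overrides stay possible, but each one has to be written down where a reviewer can see it.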

Hooks need progress because silence creates bad decisions

Hook progress streaming is another deceptively useful change. Long-running hooks are common in real agent workflows: tests, builds, linters, code generators, security scanners, policy checks, dependency audits. If they run silently, developers assume the agent is stuck. Then they cancel the session, rerun the task, or work around the hook. Silence is not neutral; it trains people to distrust the automation.

Streaming hook status into the timeline makes validation part of the session artifact. A reviewer can see that a hook ran, where it spent time, and what failed. That matters for postmortems, especially when agent-produced changes move through review asynchronously. “The hook ran for 90 seconds and failed during integration tests” is actionable. “Copilot paused for a while” is folklore.
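To make the contrast concrete, here is a sketch of turning streamed progress into the kind of sentence a postmortem can use. The event shape is entirely hypothetical: the release does not document a progress format here, so HookEvent, its fields, and the status values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookEvent:
    hook: str    # hook name, e.g. "pre-commit" (illustrative)
    stage: str   # what the hook was doing when it reported
    at: float    # seconds since the hook started
    status: str  # assumed values: "running", "ok", "failed"


def summarize_hook(events):
    """Collapse a hook's progress events into one reviewable sentence."""
    if not events:
        raise ValueError("no events to summarize")
    ordered = sorted(events, key=lambda e: e.at)
    total = ordered[-1].at - ordered[0].at
    # The first failed event names the stage that broke.
    failed = next((e.stage for e in ordered if e.status == "failed"), None)
    hook = ordered[0].hook
    if failed is not None:
        return f"{hook} ran for {total:.0f}s and failed during {failed}"
    return f"{hook} ran for {total:.0f}s and passed"
```

Without stage-level events, the only available summary is the elapsed time, which is exactly the "paused for a while" folklore described above.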

The tmux 3.6b pane progress integration is small but culturally smart. Terminal-native developers already use tmux to manage parallel work. An agent CLI that exposes progress through tmux is fitting into the existing operating environment instead of demanding a bespoke cockpit. The same category includes Wayland clipboard fixes, full-color shell inheritance, and extension subprocess compatibility with older CLI versions. Boring terminal fidelity is not a luxury. It is how tools graduate from demo windows to daily use.

Remote-session deletion from the picker and improved messaging when remote-controlled sessions are disabled by organization policy are similarly practical. Remote control is not merely a UX feature; it is a policy surface. If an organization disables it, the CLI should say that clearly. Otherwise developers waste time debugging local config or, worse, try to route around controls because the product failed to distinguish policy from breakage.

Reasoning tokens belong in the review conversation

Reasoning-token summaries for all users may be the most procurement-relevant line in the release. Tokens are often treated as billing trivia, but for agentic coding they are operational telemetry. Reasoning spend tells you something about task complexity, prompt quality, context hygiene, model fit, and runaway risk. A task that succeeds after burning a huge reasoning budget may still be a poor workflow if a simpler tool, smaller context, or deterministic script would have solved it faster.

Teams comparing Copilot CLI, Codex, Claude Code, Gemini CLI, Qwen Code, and OpenCode should track more than “did it finish.” Track reasoning tokens, wall time, tool calls, review burden, test outcomes, rollback rate, and human intervention count. The agent market is full of vibes because most teams have not instrumented their own usage. Copilot putting reasoning counts into normal summaries makes evidence easier to collect.
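A small amount of structure is enough to start collecting that evidence. The sketch below groups session records by task class and reports a few of the metrics listed above. The record fields and the summarize function are assumptions for illustration, and the numbers would have to be entered from whatever each tool's own summaries report.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass
class SessionRecord:
    task_class: str           # e.g. "bugfix", "refactor" (team-defined)
    reasoning_tokens: int
    wall_seconds: float
    tool_calls: int
    human_interventions: int
    tests_passed: bool


def summarize(records):
    """Aggregate per task class: medians, pass rate, interventions."""
    groups = defaultdict(list)
    for record in records:
        groups[record.task_class].append(record)

    summary = {}
    for task_class, rows in groups.items():
        summary[task_class] = {
            "sessions": len(rows),
            "median_reasoning_tokens": median(r.reasoning_tokens for r in rows),
            "median_wall_seconds": median(r.wall_seconds for r in rows),
            "pass_rate": sum(r.tests_passed for r in rows) / len(rows),
            "interventions_per_session": sum(r.human_interventions for r in rows) / len(rows),
        }
    return summary
```

Medians are used rather than means so that one runaway session does not hide what a typical task costs. The runaway sessions are worth listing separately.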

The immediate practitioner move is to treat Copilot CLI sessions as configured environments. Mount only the plugin directories a task needs. Pin marketplace plugin refs when the workflow matters. Keep project skills reviewed and avoid accidental name collisions. Use hook progress for long-running validation instead of hiding checks behind opaque shell commands. Check remote-control policy before building automation around remote sessions. Start logging reasoning tokens by task class before someone asks whether the expensive agent is actually cheaper than the human review it creates.

Copilot CLI 1.0.55-3 is not a grand announcement. It is more interesting than that. It is GitHub adding the knobs that serious teams eventually require: scoped capabilities, visible validation, deterministic precedence, policy-aware remotes, and cost accounting. The CLI wars are becoming runtime-governance wars. Copilot has the distribution advantage. Releases like this are about whether it can earn the operational trust that distribution alone does not buy.

Sources: GitHub Copilot CLI v1.0.55-3 release, GitHub Copilot CLI releases, GitHub Copilot CLI repository
