context-mode v1.0.136 Fixes the Unsexy Failure Modes That Break Coding-Agent Context

The least glamorous release in a coding-agent stack is often the one you should read most carefully. Shiny launches tell you what a tool wants to become. Maintenance releases tell you where real users are bleeding.

context-mode v1.0.136 is that kind of release. It is not a grand repositioning. It fixes an idle-shutdown regression, makes macOS native-binary cache swaps atomic, routes Codex shell payloads through the same Bash policy path as Claude Bash, improves MCP bridge respawning, records more events, adds nudges, and makes output easier to collapse. That is not launch-post material. It is production-infrastructure material.

The project already had the attention. During the research window, the repo showed roughly 14,931 stars, 1,062 forks, seven open issues, and fresh pushes on May 17. The README claims context-mode can reduce raw tool output from 315 KB to 5.4 KB — a 98% reduction — while tracking file edits, git activity, tasks, errors, and user decisions in SQLite, then retrieving relevant state through FTS5 and BM25 after compaction. It supports 15 platforms and plugs into the messy world of Claude Code, Codex, Cursor, OpenCode, Kilo, editor MCP hosts, and friends.

Those numbers explain why people care. The v1.0.136 fixes explain why the category is hard.

The Codex policy bug is the headline hiding in the plumbing

The most important line item is the shell-policy fix. Claude Bash payloads use one shape: { command }. Codex exec_command payloads use another: { cmd }. The prior routing only read command, which meant Codex shell calls could bypass the security policy path intended for shell execution. v1.0.136 adds a getShellCommand() normalization path so both shapes go through policy.
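
The normalization idea can be sketched in a few lines. This is a minimal illustration, not the project's code: the Python function name get_shell_command, the is_allowed helper, and the deny-list are all hypothetical, and the sketch assumes both payload shapes carry the command as a plain string.

```python
from typing import Any, Mapping, Optional, Sequence

# Keys that may carry a shell command, per the two payload shapes described above:
# "command" (Claude Bash) and "cmd" (Codex exec_command).
SHELL_COMMAND_KEYS = ("command", "cmd")


def get_shell_command(payload: Mapping[str, Any]) -> Optional[str]:
    """Return the command string from either payload shape, or None if absent."""
    for key in SHELL_COMMAND_KEYS:
        value = payload.get(key)
        if isinstance(value, str) and value.strip():
            return value
    return None


def is_allowed(
    payload: Mapping[str, Any],
    denied_prefixes: Sequence[str] = ("rm -rf",),  # hypothetical policy, for illustration
) -> bool:
    """Apply one policy check to every host's payload after normalization."""
    command = get_shell_command(payload)
    if command is None:
        # Fail closed: an unrecognized payload shape is denied, not waved through.
        return False
    return not any(command.strip().startswith(prefix) for prefix in denied_prefixes)
```

The design choice worth copying is the fail-closed branch: a payload the normalizer does not recognize is rejected rather than silently skipping the policy check.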

This is exactly the kind of bug that appears when agent infrastructure expands from one favorite client to a real ecosystem. If your policy layer recognizes Claude but not Codex, you do not have a policy layer. You have a Claude-specific filter with a Codex-shaped hole. That distinction matters because teams are increasingly standardizing context tooling across multiple agents. The chosen agent may vary by repo, task, budget, IDE, or corporate preference. The security layer cannot depend on every client encoding tool calls the same way.

For practitioners, the lesson is direct: adapter parity is security work. When you add a new agent host, do not only test whether context injection works. Test command policy, file-write policy, logging, redaction, error handling, tool-call schemas, reconnect behavior, and compaction behavior. A context tool that cuts raw tool output by 98% but silently bypasses shell policy for one client is not a productivity win. It is a footgun with a nicer prompt.

Idle shutdown broke trust, not just sessions

The second major fix is lifecycle-related. v1.0.132 introduced a global 15-minute idle self-shutdown. That sounds reasonable until you remember that Claude Code, Codex, and some editor MCP clients do not automatically respawn child MCP servers. After the timeout, every ctx_* call could return “MCP server has exited.” v1.0.136 makes the idle timeout opt-in, so the server stays up unless a user explicitly enables the shutdown.
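
An opt-in timeout usually comes down to "no setting means no shutdown." The sketch below shows that shape under stated assumptions: the environment variable name EXAMPLE_IDLE_TIMEOUT_MINUTES is invented for illustration and is not a documented context-mode setting.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, used only for this illustration.
IDLE_TIMEOUT_ENV = "EXAMPLE_IDLE_TIMEOUT_MINUTES"


def idle_timeout_seconds(env: Mapping[str, str] = os.environ) -> Optional[float]:
    """Return the idle timeout in seconds, or None when shutdown is not enabled."""
    raw = env.get(IDLE_TIMEOUT_ENV, "").strip()
    if not raw:
        return None  # unset: the server never shuts itself down
    try:
        minutes = float(raw)
    except ValueError:
        return None  # unparseable value: treat as "not opted in"
    return minutes * 60 if minutes > 0 else None
```

Treating a missing, zero, or malformed value as "disabled" keeps the safe behavior as the default on hosts that cannot respawn the server.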

This is the unsexy part of agent infrastructure: host supervision behavior is not uniform. Some clients respawn subprocesses. Some do not. Some surface injected hook context. Some accept and log it but do not show it to the model. Some handle collapsible output well. Some turn a useful context layer into noise. MCP gives the ecosystem a protocol surface, not a guarantee that every host behaves like every other host.

That matters because trust in a context system is fragile. If the tool vanishes mid-task after idle time, the developer learns to stop relying on it. If compaction loses the one decision that prevented a bad refactor, the agent becomes worse than stateless because it pretends to remember. If policy routing differs by host, users cannot reason about risk. The deeper context-mode gets into the critical path — tracking decisions, compressing output, enforcing policy, nudging the model — the more it needs ordinary software-engineering discipline: version pinning, rollback, compatibility matrices, and tests that cover host-specific behavior.

Bigger windows are not a strategy

The popularity of context-mode is partly a reaction to a lazy answer: just wait for larger context windows. Larger windows help, but they do not solve the real problem. Raw tool output is often redundant, irrelevant, stale, or actively misleading. A model with a giant window can still drown in logs, forget why a decision was made after compaction, or carry forward bad context because nobody modeled state explicitly.

A better pattern is narrower output, indexed evidence, session memory with provenance, and policy-aware tool routing. context-mode’s architecture points in that direction: SQLite for local state, FTS5/BM25 for retrieval, compaction-aware session tracking, host integrations, and explicit policy paths for shell execution. That is more work than dumping everything into the prompt. It is also the only approach that scales once agents run long tasks across multiple tools and multiple days.
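
To make the retrieval half concrete, here is a minimal sketch of BM25-ranked search over session events using SQLite FTS5. The table name, columns, and helper functions are hypothetical and do not reflect context-mode's actual schema; the sketch also assumes the local SQLite build ships with FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3
from typing import List, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one full-text-indexed row per session event."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: event kind (decision, error, edit, ...) plus free text.
    conn.execute("CREATE VIRTUAL TABLE events USING fts5(kind, body)")
    return conn


def record(conn: sqlite3.Connection, kind: str, body: str) -> None:
    """Store one event so it can be retrieved after the prompt has been compacted."""
    conn.execute("INSERT INTO events (kind, body) VALUES (?, ?)", (kind, body))


def search(conn: sqlite3.Connection, query: str, limit: int = 3) -> List[Tuple[str, str]]:
    """Return the best-matching events first.

    FTS5's bm25() yields lower scores for better matches, so ascending order
    puts the most relevant rows at the top.
    """
    rows = conn.execute(
        "SELECT kind, body FROM events WHERE events MATCH ? "
        "ORDER BY bm25(events) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Recording a decision such as "Do not refactor the auth module until the migration lands" and later calling search(conn, "refactor") would bring that row back without replaying the whole session into the prompt.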

The macOS atomic cache-swap fix sits in the same category. Native better-sqlite3 binary cache updates now stage, codesign, and atomically rename instead of copying and signing the active binary in place. Again: not glamorous. But if your context layer depends on a local database and native binaries, update mechanics become reliability mechanics. The agent ecosystem loves to pretend it lives in model space. Users experience it as software that either starts, persists, and recovers correctly — or does not.
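
The stage-then-rename pattern is easy to sketch. The version below is an illustration under assumptions, not the project's updater: atomic_swap and its finalize hook are hypothetical names, and the hook stands in for a step such as code signing, which is not performed here. File permissions are also left out of scope.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


def atomic_swap(
    target: Path,
    data: bytes,
    finalize: Optional[Callable[[Path], None]] = None,
) -> None:
    """Replace target with data without ever exposing a half-written file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage in the same directory so the final rename stays on one filesystem.
    fd, staged_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".staged"
    )
    staged = Path(staged_name)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        if finalize is not None:
            finalize(staged)  # hypothetical hook, e.g. signing the staged copy
        # os.replace swaps the staged file into place in a single step.
        os.replace(staged, target)
    except BaseException:
        # Any failure leaves the existing target untouched; clean up the staged copy.
        staged.unlink(missing_ok=True)
        raise
```

If the finalize step fails, the active file is exactly what it was before the update started, which is the property the in-place copy-and-sign approach cannot offer.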

If you run coding agents in a team, treat context infrastructure like a runtime dependency, not a personal productivity plugin. Pin versions for shared workflows. Read release notes for policy changes. Keep a small compatibility matrix for your actual hosts: Claude Code, Codex, Cursor, OpenCode, Kilo, Gemini CLI, or whatever you use. Add smoke tests that exercise shell calls, file edits, compaction, reconnect, and retrieval. Decide what context should never be stored: secrets, customer data, temporary debug dumps, and private chat that does not belong in repo memory.
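
A compatibility matrix does not need tooling to be useful. As a rough sketch, with hypothetical check names taken from the smoke-test list above, a few lines are enough to flag which hosts still lack a passing check:

```python
from typing import Dict, List, Set

# Hypothetical check names mirroring the smoke tests suggested above.
REQUIRED_CHECKS = ("shell_policy", "file_edits", "compaction", "reconnect", "retrieval")


def coverage_gaps(matrix: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each host to the required checks it has not passed; omit fully covered hosts."""
    gaps: Dict[str, List[str]] = {}
    for host, passed in matrix.items():
        missing = [check for check in REQUIRED_CHECKS if check not in passed]
        if missing:
            gaps[host] = missing
    return gaps
```

Feeding it the results of a nightly smoke run turns "we think Codex is covered" into a list of named gaps per host.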

My take: v1.0.136 is a healthy release because it fixes the boring failure modes that determine whether teams can trust a context layer. Token savings got everyone’s attention. Lifecycle correctness, host parity, and shell-policy normalization are what keep the tool from becoming another clever demo that breaks the moment it touches real work.

Sources: context-mode v1.0.136 release, context-mode repository, context-mode README, Cursor hook limitation thread