Claude Code's April 15 Release Fixes the Economics of Coming Back to Work

Claude Code's April 15 Release Fixes the Economics of Coming Back to Work

Claude Code's April 15 Release Fixes the Economics of Coming Back to Work

Claude Code's latest release is small on paper and quietly important in practice. The April 15 changes are mostly about prompt-cache control, session re-entry, and discoverability of built-in workflows — which sounds like housekeeping until you notice the surrounding user complaints about quota burn and stale-session cost. Anthropic is not just shipping features here. It is patching the economics of daily Claude Code usage.

If you have been using Claude Code seriously for more than a few weeks, you have probably noticed the problem. You start a session, work for an hour, step away for a meeting, come back, and Claude has lost the thread — or worse, it has not lost the thread but the cost of rebuilding it is much higher than it should be. The model did not forget anything. The context window is still full of history. But something about the way the session rehydrated ate more tokens than expected, and now your daily budget is meaningfully lower than it was an hour ago. This is the problem this release attacks.

The Cache TTL Controls Are a Real Cost Lever

Two new environment variables landed with this release. ENABLE_PROMPT_CACHING_1H opts into a one-hour prompt cache TTL for API key, Bedrock, Vertex, and Foundry users. FORCE_PROMPT_CACHING_5M forces a five-minute TTL. The older Bedrock-specific variable is deprecated but still honored. These are not cosmetic knobs. They are direct responses to a cost problem that power users have been reverse-engineering through proxy logs for months.

Here is the math that makes this real. On a 100-request Opus session with roughly 60,000 cached tokens per request, a 5-minute TTL could drive about $11.25 in cache rewrites, versus roughly $5.57 with a one-hour TTL. That is a 50 percent cost difference on the caching line item alone. On Bedrock specifically, cache reads run at $0.9375 per million tokens while rewrites run at $9.375 per million — a 10x difference that makes TTL policy a genuine product decision, not an implementation detail.

What Anthropic also fixed: users who set DISABLE_TELEMETRY were unintentionally falling back from one-hour to five-minute cache behavior because the telemetry setting was entangled with the cache TTL logic. That is a bug that quietly inflated costs for anyone who had disabled telemetry for privacy reasons. The fix is quiet, but it matters for the teams that thought they understood their Claude Code cost profile and did not.

/recap Is the Feature Users Were Already Building Workarounds For

The new /recap command is the standout addition. When you return to a dormant session, /recap provides context — a summary of where you were, what the model was working on, what decisions were made — without forcing a full context replay. It is configurable through /config and can be force-enabled with the CLAUDE_CODE_ENABLE_AWAY_SUMMARY environment variable for users who run with telemetry disabled.

This is an answer to a real workflow bug that Anthropic did not explicitly call out as a bug, which is probably intentional. Humans do not work in uninterrupted flows. They attend meetings, handle incidents, review PRs, and context-switch between tasks that have nothing to do with the code they were just writing. A tool that assumes continuous flow is optimizing for demo videos and single-task sprints, not for how engineering teams actually operate. /recap is Anthropic admitting that session return is a workflow worth designing for rather than a rare edge case to handle gracefully.

The feature emerged from a pattern Anthropic had apparently been tracking: users returning to sessions after coffee breaks, lunch, or overnight pauses were either losing context or paying to rebuild it. The fix is a summary layer that sits between the raw session history and the active context window. It is not magic — it is an operational design decision that says the session return experience is part of the product, not an implementation artifact.

Claude Can Now Find Its Own Built-in Commands

A smaller addition that deserves attention: Claude can now discover and invoke built-in slash commands like /init, /review, and /security-review through the Skill tool. This is not a user-facing feature in the traditional sense. It is an internal capability expansion that means Claude can now chain its own built-in workflows programmatically rather than relying on the user to know what is available and invoke it explicitly.

What this enables in practice: agents, skills, and plugins can now call Claude's built-in commands as tools rather than asking the human to do it. If you have a workflow that includes a review step, the agent can invoke /review as part of a larger automated process instead of stopping to ask the user to run it. That is a meaningful step toward agents that can own multi-step workflows end-to-end rather than handing off at each internal decision point.

The Community Signal Was in the Issue Tracker, Not the Release Notes

The Hacker News thread that accompanied this release included a comment from Boris on the Claude Code team that is more informative than most release notes. He laid out the concrete suspected causes behind surprise quota burn rather than waving at general capacity issues: cache misses on stale one-million-token sessions, hidden cost from heavy skill and plugin usage, and the five-minute versus one-hour TTL distinction being entangled with telemetry settings.

That level of specificity from a team member is unusual and worth noting. Most product teams do not publish their cost breakdown reasoning in comment threads. The fact that Boris did suggests either that Anthropic is unusually transparent about its internal diagnosis, or that the community pressure around quota burn had reached a level where the normal corporate communication instincts were overridden by the need to demonstrate that the company understood the problem.

The same thread surfaced a feature request that did not make this release but is worth watching: better user-facing analytics for which skills and workflows are actually burning usage. Practitioners want to see a breakdown of which skills, agents, or plugins drove the most cache misses and token consumption. That request has not been fulfilled yet, but the fact that the team acknowledged it as a legitimate ask is a signal that more visibility into session economics is on the roadmap.

The Deeper Pattern: Claude Code Is Maturing Into Infrastructure

Stepping back, what this release reveals is that Claude Code's release cadence is drifting toward operational maturity. The past weeks have delivered prompt-cache controls, /recap, remote-task features, routines, and desktop parallelism. None of that is flashy model theater. It is Anthropic optimizing the boring parts that determine whether a coding agent is delightful for demos or tolerable for eight-hour workdays.

The competitive frontier in coding agents is shifting. Raw model output quality still matters, but it is increasingly the price of entry rather than the differentiator. What separates the tools that developers adopt for real work from the ones they try for a week and abandon is operational behavior: how the tool handles context reuse, interruption recovery, session cost visibility, and the long tail of edge cases that accumulate once a tool escapes the happy path of simple, short tasks.

For builders evaluating coding agents, the lesson is practical: stop benchmarking only raw model output quality. Benchmark session return cost, cache behavior, long-context ergonomics, and whether the tool helps users recover from interruption without quietly torching quota. The model that writes the best code but burns your budget when you check Slack is less useful than the model that writes slightly worse code and respects your time and money.

Sources: GitHub Releases, Hacker News discussion, Claude Code overview docs