
GitHub Confirms May 1 Billing Transition: What Copilot's Token-Priced Future Actually Costs

Anatoliy Kolodkin

01 May 2026 • 4 min read

GitHub's May 1 billing announcement is not about transparency. It is about the end of a pricing fiction.

For the past two years, Copilot Pro at $10 a month gave you a monthly allowance of "premium requests" against the best models, and developers built workflows around that mental model. The new billing model, which goes live June 1, replaces that fiction with math. Every AI model, every context window, every cache write now has a specific dollar cost measured in AI credits. The flat-rate era is over, and the actual numbers are now on the table.

The pricing table is where the story gets real. Claude Opus 4.7 on Copilot runs $5 per million input tokens, $0.50 per million cached input tokens, $6.25 per million cache writes, and $25 per million output tokens. A session with no context reuse, the baseline case for fresh autonomous agent work, pays both the cache-write rate and the output rate: $31.25 for every million tokens written to cache plus a million tokens generated. Sonnet 4.6 is roughly half that. Haiku 4.5 is roughly a fifth. The gap between models is not a rounding error; it is a different product tier, and the credit math makes that explicit in a way the old request-count model obscured.
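The rates above can be turned into a small calculator. This is a minimal sketch, not GitHub's billing logic: it assumes cost is a simple sum of token counts times the per-million rates quoted in this article, and the names `Rates` and `session_cost` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-million-token rates in USD, as quoted in the article for Opus 4.7."""
    fresh_input: float = 5.00
    cached_input: float = 0.50
    cache_write: float = 6.25
    output: float = 25.00


OPUS_4_7 = Rates()


def session_cost(rates, fresh_input=0, cached_input=0, cache_write=0, output=0):
    """Return the USD cost of one session (assumption: purely additive pricing)."""
    total = (
        fresh_input * rates.fresh_input
        + cached_input * rates.cached_input
        + cache_write * rates.cache_write
        + output * rates.output
    )
    return total / 1_000_000


# Cold start: a million tokens written to cache plus a million tokens of output.
cold = session_cost(OPUS_4_7, cache_write=1_000_000, output=1_000_000)
print(f"cold: ${cold:.2f}")  # cold: $31.25

# Same volume, but the prompt is served from cache instead of rewritten.
warm = session_cost(OPUS_4_7, cached_input=1_000_000, output=1_000_000)
print(f"warm: ${warm:.2f}")  # warm: $25.50
```

Swapping in a different `Rates` instance is enough to compare models once you have their rows from the pricing table.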

The operational implication lands fast. Autonomous agent sessions (the kind that walk through a codebase, plan across files, run tools, and generate substantive output) routinely consume 50,000 to 200,000 tokens per session. One fully utilized Opus session at typical cache-miss rates runs roughly $0.63 in AI credits. Run five of those a day, five days a week, and you are at roughly $68 per month. Spend all 1,500 of the premium requests that Pro+ previously included on sessions like that, and the bill is roughly $945 per month in credits. Once a "request" is an agentic session rather than a single completion, the move from request counting to token counting is an order-of-magnitude shift in cost structure.
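Monthly projections like these are easy to sanity-check. The sketch below takes the article's $0.63 per-session figure as given and assumes an average month of 52/12 weeks; the function names are illustrative only.

```python
WEEKS_PER_MONTH = 52 / 12  # about 4.33; an averaging assumption


def monthly_cost(cost_per_session, sessions_per_day, days_per_week):
    """Project a monthly bill from a steady working cadence."""
    sessions = sessions_per_day * days_per_week * WEEKS_PER_MONTH
    return sessions * cost_per_session


def cost_for_sessions(cost_per_session, sessions):
    """Cost of a fixed number of sessions, e.g. a former request allowance."""
    return cost_per_session * sessions


steady = monthly_cost(0.63, sessions_per_day=5, days_per_week=5)
allowance = cost_for_sessions(0.63, sessions=1500)

print(f"5/day, 5 days/week: ${steady:.2f}")  # 5/day, 5 days/week: $68.25
print(f"1,500 sessions:     ${allowance:.2f}")  # 1,500 sessions:     $945.00
```

At $0.63 per session, a $945 month corresponds to about 1,500 sessions, which is the size of the old Pro+ premium request allowance.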

GitHub has paused new Pro, Pro+, and Student sign-ups. It has removed Opus 4.5 and 4.6 from Pro+, keeping only Opus 4.7. The billing preview tool rolling out in early May will let users see side-by-side current versus estimated credit costs based on April usage. None of these are cosmetic changes. They are structural responses to the reality that flat-rate frontier model access does not math out when customers run autonomous agents that burn context at full speed.

The most interesting competitive pressure this creates is between Copilot's Claude resale and Claude Code direct. GitHub is now reselling Claude Opus 4.7 at token-weighted rates that include a meaningful markup over Anthropic's own API pricing. For developers running heavy Opus sessions, Claude Code via the Anthropic API is probably cheaper than Copilot with Opus. Sonnet and Haiku just became the default for routine Copilot work. Opus 4.7 is now an occasional tool, not a daily driver, unless AI credit spending is not a constraint.

There is a subtler point buried in the pricing table: cache policy is now a first-class cost driver. A 1-hour prompt cache TTL on deployments that use an API key, Bedrock, Vertex, or Foundry can cut cache rewrite costs by roughly 50%, and GitHub's table makes that math explicit. For developers who leave sessions idle, switch branches, or bounce between tasks, the new model rewards cache hygiene in a way the flat-rate model never asked you to think about.
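To see why TTL matters, here is a toy model of one session. It is a sketch under loud assumptions: the idle gaps are invented, the 5-minute short TTL is an assumed comparison point, a cache hit is assumed to refresh the TTL, and both TTLs are priced at the article's $6.25 cache-write rate (a provider may price longer TTLs differently). It does not reproduce GitHub's table.

```python
CACHE_WRITE_PER_M = 6.25   # USD per million tokens, from the article
CACHED_INPUT_PER_M = 0.50  # USD per million tokens, from the article


def prefix_cost(idle_gaps_min, ttl_min, prefix_tokens):
    """Cost of re-supplying a fixed prompt prefix across one session.

    The first turn always writes the cache. Each later turn reads the cache
    if the idle gap before it is within the TTL, and rewrites it otherwise.
    """
    writes, reads = 1, 0
    for gap in idle_gaps_min:
        if gap <= ttl_min:
            reads += 1
        else:
            writes += 1
    per_million = writes * CACHE_WRITE_PER_M + reads * CACHED_INPUT_PER_M
    return per_million * prefix_tokens / 1_000_000


# Hypothetical idle gaps (minutes) between turns in a stop-and-start session.
gaps = [2, 12, 3, 45, 1, 20, 4, 90]

five_min = prefix_cost(gaps, ttl_min=5, prefix_tokens=100_000)
one_hour = prefix_cost(gaps, ttl_min=60, prefix_tokens=100_000)

print(f"5-minute TTL: ${five_min:.3f}")  # 5-minute TTL: $3.325
print(f"1-hour TTL:   ${one_hour:.3f}")  # 1-hour TTL:   $1.600
```

The savings depend entirely on how bursty your sessions are: with no gaps longer than the short TTL, the two policies cost the same.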

The pause on new sign-ups is also a signal about inference capacity. Combined with reporting that Anthropic's annualized revenue run rate has grown from $14 billion to $30 billion in two months, the picture is of a company limited by model inference capacity, not by demand. GitHub's token pricing for Opus reflects that constraint. The credits system lets GitHub pass inference costs directly to users rather than absorbing them as a loss on flat-rate plans. That is an honest move, even if the math stings.

The community reaction in GitHub's discussion forums reflects genuine pain from developers who built workflows around Pro+ and are now doing the math on what those workflows actually cost. One user's calculation — fully-utilized 200k Opus context at typical cache-miss rates approaching $4,000 per month in credits — is an edge case, but it is not a fantasy. Power users who run multiple autonomous sessions per day are discovering that their "unlimited" usage had a hidden subsidy, and that subsidy is now gone.

The billing transition does offer a 30-day window before June 1, and the CSV usage report with aic_quantity and aic_gross_amount columns will let developers actually see what their April patterns cost under the new model. That data will be more convincing than any announcement. Developers who want to stay on Copilot will need to understand their token burn rate per session. Developers who find that rate unsustainable have a clear alternative: Sonnet or Haiku for routine work, and Claude Code direct for heavy Opus sessions.
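Once you have the CSV, a few lines of Python are enough to total it. This is a sketch, not a documented schema: only `aic_quantity` and `aic_gross_amount` come from the announcement, while the `date` and `model` columns and every value in the sample are hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict


def summarize_usage(csv_file, group_by="model"):
    """Sum AI credits and gross amount per group from a usage report.

    `group_by` names a column that is assumed to exist in the export;
    rows without it are collected under "(unknown)".
    """
    totals = defaultdict(lambda: {"credits": 0.0, "gross": 0.0})
    for row in csv.DictReader(csv_file):
        key = row.get(group_by) or "(unknown)"
        totals[key]["credits"] += float(row["aic_quantity"] or 0)
        totals[key]["gross"] += float(row["aic_gross_amount"] or 0)
    return dict(totals)


# Invented sample rows; real exports will have different columns and values.
SAMPLE = """date,model,aic_quantity,aic_gross_amount
2026-04-01,model-a,120,1.20
2026-04-01,model-b,30,0.30
2026-04-02,model-a,80,0.80
"""

summary = summarize_usage(io.StringIO(SAMPLE))
for name, t in sorted(summary.items()):
    print(f"{name}: {t['credits']:.0f} credits, ${t['gross']:.2f}")
```

For a real export, pass an open file instead of the `io.StringIO` sample, for example `summarize_usage(open("usage.csv", newline=""))`, and adjust `group_by` to whatever column your report actually contains.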

The bottom line is simple: GitHub did not just change how it bills. It changed which models are economically viable for which use cases. Sonnet and Haiku are the default for routine Copilot work, and Opus 4.7 is an occasional tool rather than a daily driver, unless you are treating AI credits as play money. For developers who built workflows around Opus on Pro+, the next 30 days are a migration window, and the most rational destination is probably Claude Code direct.

Sources: GitHub Docs: Plans for GitHub Copilot, GitHub Copilot billing docs, GitHub Community Discussion #192963
