
GitHub Copilot's New Metered Billing Is Microsoft's First Honest Admission That Unlimited AI Coding Was Always Borrowed Time

Anatoliy Kolodkin

28 Apr 2026 • 3 min read

GitHub Copilot has a pricing problem it finally decided to stop pretending does not exist.

Starting June 1, GitHub is moving every Copilot plan from request-based billing to usage-based billing, replacing flat monthly quotas with a token-metered system called GitHub AI Credits. Under the old model, a quick two-minute prompt and a two-hour autonomous refactoring session could draw down the same request quota. That was fine when AI coding was a novelty. It stopped being fine once professional developers started running Copilot on production codebases for hours at a stretch. GitHub's own CPO Mario Rodriguez put it plainly: under request-based billing, the system was charging users the same amount regardless of how much compute their session actually consumed.

The Register's framing was the most honest: GitHub is finally admitting that selling AI coding assistance as an unlimited buffet, like Red Lobster's Endless Shrimp, was never sustainable. The metered model converts Copilot from a predictable subscription into a variable cloud-style bill, because complex autonomous coding sessions cost GitHub significantly more to deliver than simple autocomplete. That is not a scandal. It is how infrastructure economics work. The question for practitioners is what changes now, and what actually stays the same.

Here is what changes. GitHub AI Credits become the currency unit at $0.01 per credit, with pricing based on input tokens, output tokens, and cached tokens at different rates per model. Monthly allotments are tiered: Copilot Pro gets 1,000 credits per month, Copilot Pro+ gets 3,900, Copilot Business gets 1,900 per user per month, and Copilot Enterprise gets 3,900 per user per month. Annual subscription customers face the sharpest sticker shock on premium models: Anthropic Opus 4.7 jumps from a 7.5x multiplier to 27x (a 3.6-fold increase), and GPT-5.4 goes from 1x to 6x. Those are not marginal adjustments. For teams running Opus 4.7 heavily in agentic workflows, the annual plan just got substantially more expensive per request.
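To make the credit arithmetic concrete, here is a minimal Python sketch. The allotments and the $0.01 credit price are the figures quoted above. The per-token rates in `HYPOTHETICAL_RATES` are invented placeholders, not GitHub's published prices, and the function names are illustrative only.

```python
from dataclasses import dataclass

CREDIT_PRICE_USD = 0.01  # quoted above: $0.01 per GitHub AI Credit

# Monthly credit allotments quoted above (per user for Business and Enterprise).
MONTHLY_ALLOTMENT = {
    "Copilot Pro": 1_000,
    "Copilot Pro+": 3_900,
    "Copilot Business": 1_900,
    "Copilot Enterprise": 3_900,
}


@dataclass(frozen=True)
class TokenRates:
    """Credits charged per one million tokens, by token type."""
    input_per_m: float
    output_per_m: float
    cached_per_m: float


# ASSUMPTION: made-up rates for illustration, not real model pricing.
HYPOTHETICAL_RATES = TokenRates(input_per_m=300.0, output_per_m=1500.0, cached_per_m=30.0)


def session_credits(rates: TokenRates, input_tokens: int,
                    output_tokens: int, cached_tokens: int = 0) -> float:
    """Credits consumed by one session under the given per-model rates."""
    weighted = (input_tokens * rates.input_per_m
                + output_tokens * rates.output_per_m
                + cached_tokens * rates.cached_per_m)
    return weighted / 1_000_000


def allotment_usd(plan: str) -> float:
    """Dollar value of a plan's monthly credit allotment."""
    return MONTHLY_ALLOTMENT[plan] * CREDIT_PRICE_USD


if __name__ == "__main__":
    used = session_credits(HYPOTHETICAL_RATES, 200_000, 20_000, 1_000_000)
    print(f"session: {used:.1f} credits (${used * CREDIT_PRICE_USD:.2f})")
    print(f"Copilot Pro allotment: ${allotment_usd('Copilot Pro'):.2f}")
```

Under these placeholder rates, a session with 200,000 input tokens, 20,000 output tokens, and 1,000,000 cached tokens costs 120 credits, or $1.20, so a Copilot Pro allotment would cover roughly eight such sessions. The real figure depends entirely on the per-model rates GitHub publishes.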

Here is what does not change, at least not yet: code completions and Next Edit Suggestions remain unlimited on paid plans, even after your monthly credit allotment is exhausted. Those are the bread-and-butter autocomplete features that most developers use most of the time. If your Copilot usage looks like standard IntelliSense-style suggestions and occasional chat questions, you may notice very little practical difference. The metered model primarily hits expensive agentic workloads (long autonomous sessions, complex multi-file refactoring, heavy use of premium models), which is where the cost mismatch between flat billing and actual compute was most egregious.

The more important shift for engineering managers is the mental model change. Under token-metered billing, AI coding costs become a variable line item instead of a fixed subscription cost. That requires a different kind of planning: budget dashboards, usage alerts, per-model cost attribution, and credit cap management that flat-rate plans never demanded. GitHub is introducing a preview bill experience in early May so users can see projected costs before the June 1 transition. That is the right first step, but the real work for adopters is modeling their actual consumption patterns now, before the billing model changes, so there are no surprises in June.
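A usage alert of the kind described here can start out very simple. The sketch below projects month-end consumption linearly from month-to-date usage and flags when the projection crosses a threshold. It assumes overage is billed at the same $0.01 per credit, which should be confirmed against the billing docs; the function names and the 80% threshold are arbitrary illustrations, not a GitHub API.

```python
import calendar
from datetime import date

CREDIT_PRICE_USD = 0.01  # ASSUMPTION: overage is billed at the same per-credit price


def project_month_end(credits_used: float, today: date) -> float:
    """Linear projection of month-end usage from month-to-date usage."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return credits_used * days_in_month / today.day


def budget_status(credits_used: float, allotment: float, today: date,
                  alert_ratio: float = 0.8) -> dict:
    """Projected usage, projected overage in USD, and an alert flag."""
    projected = project_month_end(credits_used, today)
    overage = max(0.0, projected - allotment)
    return {
        "projected_credits": projected,
        "projected_overage_usd": round(overage * CREDIT_PRICE_USD, 2),
        "alert": projected >= alert_ratio * allotment,
    }


if __name__ == "__main__":
    # A Copilot Business seat (1,900 credits) that has used 950 credits by June 10.
    print(budget_status(950, 1_900, date(2026, 6, 10)))
```

A linear projection is crude, since agentic usage tends to be bursty, but it is enough to surface a seat that is on track to double its allotment well before the invoice arrives.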

The broader context is that GitHub is following a path other AI API providers have already walked. Anthropic, Google, and OpenAI have all moved to cap or meter previously unlimited usage in recent months. The "AI coding is essentially free" era was always subsidized: by provider margins, by investor patience, and by the fact that early adopters were doing less volume than they thought. The move to metered billing is not evidence that AI coding failed. It is evidence that it graduated from novelty to real infrastructure, which means it gets treated like real infrastructure: metered, budgeted, and optimized.

The teams that will adapt fastest are the ones that start building intuition for token economics the same way they built intuition for compute-hour economics a decade ago. That means tracking what different models and workflows actually cost per session, setting budget caps before they become surprises, and being intentional about which tasks deserve premium model spend and which can run on cheaper models. It also means neither panicking about the announcement nor ignoring the actual numbers.
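Tracking cost per session by workflow and model is mostly a grouping exercise once usage records are available. The sketch below assumes those records have already been exported into a simple `Session` tuple; the workflow labels, model names, and the per-session cap are hypothetical, and nothing here reflects an actual GitHub export format.

```python
from collections import defaultdict
from typing import NamedTuple


class Session(NamedTuple):
    workflow: str   # hypothetical label, e.g. "agentic-refactor"
    model: str      # hypothetical model name
    credits: float  # credits consumed by this session


def cost_per_session(sessions):
    """Average credits per session for each (workflow, model) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [credit sum, session count]
    for s in sessions:
        entry = totals[(s.workflow, s.model)]
        entry[0] += s.credits
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def over_cap(averages, cap_credits):
    """Sorted (workflow, model) pairs whose average session exceeds the cap."""
    return sorted(key for key, avg in averages.items() if avg > cap_credits)


if __name__ == "__main__":
    log = [
        Session("chat", "model-a", 2.0),
        Session("chat", "model-a", 4.0),
        Session("agentic-refactor", "model-b", 120.0),
    ]
    averages = cost_per_session(log)
    print(averages)
    print(over_cap(averages, cap_credits=50.0))
```

The pairs that come back from `over_cap` are the candidates for the "does this task deserve premium model spend" conversation.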

For Azure-adjacent practitioners, there is a secondary consideration. GitHub Copilot's billing model change is one data point in a broader shift toward consumption-based pricing for developer-facing AI tools. If your organization is standardizing on GitHub Copilot as its primary coding assistant, the finance team will start asking questions about variable AI costs the same way they ask about variable cloud compute costs. The right answer is not to resist that conversation. It is to lead it — with actual usage data, modeled projections, and a clear view of which Copilot workflows deliver enough value to justify premium pricing.

The Red Lobster analogy landed well in developer forums because it names the obvious: the all-you-can-eat model was never permanent. What matters now is not who got blamed for the change. It is whether your team has the visibility and controls in place to manage AI coding as a metered service rather than a fixed cost. Most teams do not yet. That is the actual work.

Sources: The Register, GitHub Blog, GitHub Copilot Billing Docs, Visual Studio Magazine
