GitHub Removes Gemini and GPT-5.2-Codex From Copilot Web Chat, Which Is What Governance Looks Like When the Dropdown Gets Too Big

Anatoliy Kolodkin

20 May 2026 • 6 min read

GitHub just did the thing every serious AI platform eventually has to do: it made the model dropdown smaller.

That sounds like anti-news in a week full of model launches, premium multipliers, agent APIs, and coding-assistant positioning. It is not. GitHub’s decision to remove all Gemini models, GPT-5.2 Codex, GPT-5.4 nano, and several other options from Copilot Chat on the web is a useful reminder that multi-model products do not mature by adding every possible model to every possible surface. They mature by deciding which models belong where, which defaults are defensible, and which choices create more support burden than user value.

The official changelog is short. GitHub says it has “updated our available model selection for Copilot Chat on the web to deliver more consistent, high-quality responses.” It says model choice is valuable, but the list on github.com is being limited so GitHub can “consistently ensure reliable responses.” OpenAI and Claude models across price points remain available across Copilot plans, while future web chat will support “a more limited set of new model rollouts” as GitHub works to ensure optimal performance.

The awkward part is the timing. One day earlier, GitHub announced Gemini 3.5 Flash as generally available for Copilot across major IDE surfaces, with a tentative 14x premium request multiplier. Now GitHub is removing Gemini models from Copilot Chat on the web. That is not necessarily a contradiction. It is surface-specific governance. But if your internal AI tooling guidance still says “Copilot supports Gemini” as though Copilot were one product surface with one model menu, it is already stale.

Copilot used to be easier to describe. It completed code, then chatted in the IDE, then expanded into pull requests, cloud agents, CLI workflows, Spaces, Spark, and OpenAI Codex-powered tasks. Each new surface added a different kind of work: Q&A, patch generation, repository navigation, issue handoff, background execution, code review, and agentic task delegation. Those are not the same workload. They should not automatically share the same model policy.

Web chat is a broad, casual surface. People ask explanatory questions, poke at repositories, summarize things, compare approaches, and sometimes use it as a general-purpose technical assistant. Agent mode, cloud agent, and Codex tasks are narrower and more operational: they inspect code, run tools, produce diffs, and consume premium requests in ways that can hit billing and governance controls. A model that makes sense inside an agentic IDE workflow may be a poor fit for generic web chat if its quality profile, latency, cost, or support behavior is uneven.

That is the charitable read of GitHub’s move, and it is probably mostly right. A long dropdown looks powerful until users pick models without understanding tradeoffs, get inconsistent answers, and file tickets against “Copilot” rather than the particular model/surface pairing that failed. At enterprise scale, too much choice becomes operational noise. The admin console, support docs, billing model, and developer training all have to explain what each option means.

There is also a less flattering but equally practical read: model partnerships and rollouts are moving faster than product teams can make them feel coherent. GitHub has OpenAI models, Claude models, Gemini models, Codex-specific models, nano/mini/flash/high-end variants, premium request multipliers, and plan-specific availability. At some point, the dropdown stops being empowerment and starts being a leak from the vendor routing layer into the user experience. Removing models from web chat is GitHub putting a product boundary back where one probably belongs.

GPT-5.2 Codex disappearing from web chat is not Codex disappearing

The nuance practitioners need to catch is that removing GPT-5.2-Codex from Copilot Chat on the web does not mean Codex is gone from Copilot. GitHub’s OpenAI Codex documentation still lists Codex agent models separately, including Auto, GPT-5.2-Codex, GPT-5.3-Codex, GPT-5.4, and GPT-5.4 nano for OpenAI Codex coding-agent tasks. In other words: the model is being removed from one surface, not erased from the platform.

That distinction is now central to administering AI coding tools. “Available in Copilot” is no longer a complete sentence. Available where? Web chat? VS Code chat? Agent mode? Cloud agent? Copilot CLI? Code review? Spaces? Spark? OpenAI Codex VS Code integration? A third-party agent plugged into Copilot billing? Each surface may have different model choices, different admin policies, different request accounting, and different user expectations.

GitHub’s own billing docs make the direction explicit: Copilot is moving from request-based billing to usage-based billing starting June 1, 2026, and premium requests can be consumed across Copilot Chat, CLI, code review, cloud agent, Spaces, Spark, OpenAI Codex VS Code integration, and third-party agents. That means the model menu is not just a UX detail. It is a cost-control surface.

The 14x Gemini 3.5 Flash multiplier announced for Copilot IDE surfaces is the sharp example. A model with a high multiplier might be completely rational for a hard debugging session if it avoids ten failed attempts with cheaper models. It might be irresponsible as a casual web-chat default for “explain this regex.” A 0.33x mini model might be ideal for small chores and wrong for ambiguous architecture review. Model governance is not about choosing the best model. It is about choosing the smallest sufficient model for each task class, then giving experts escape hatches with observability.
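The multiplier arithmetic is easy to sketch. The snippet below assumes, as a simplification, that cost in premium-request units is the multiplier times the number of requests. The function names are hypothetical, and the 1x baseline is an assumed comparison point rather than a figure from GitHub. It deliberately ignores engineer time, which is often where a high-multiplier model earns its cost.

```python
import math


def premium_units(multiplier: float, requests: int) -> float:
    """Cost in premium-request units, assuming cost = multiplier * requests."""
    return multiplier * requests


def break_even_attempts(expensive: float, cheap: float) -> int:
    """Smallest number of cheap-model requests costing at least one expensive request."""
    return math.ceil(expensive / cheap)


# Multipliers quoted in the article: 14x (tentative) and 0.33x.
# The 1x baseline is an assumption added for comparison.
for cheap in (1.0, 0.33):
    attempts = break_even_attempts(14.0, cheap)
    print(f"one 14x request costs about as much as {attempts} requests at {cheap}x")
```

On request units alone, ten failed attempts at 1x still cost less than one 14x request. The case for the expensive model therefore rests on time saved and wrong answers avoided, which is why the multiplier belongs in a written policy and not only in a dropdown.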

Admins need an inventory, not a screenshot

The practical response for engineering organizations is boring and necessary: inventory model availability by surface. Make a matrix. Rows should be the surfaces your team actually uses: github.com Copilot Chat, VS Code, JetBrains, CLI, cloud agent, code review, Spaces, Spark, OpenAI Codex agent tasks, and any third-party agents wired into Copilot. Columns should include available models, default model, admin policy owner, billing unit or multiplier, approved use cases, restricted use cases, logging/telemetry posture, and update cadence.
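One way to keep that matrix honest is to store it as data instead of a slide. The sketch below is a minimal illustration, not a GitHub schema: the `SurfaceEntry` type, its field names, the owner name, and the placeholder model IDs (`openai-model-a`, `claude-model-a`) are all hypothetical. Only the Gemini 3.5 Flash entry and its tentative 14x multiplier come from the announcements discussed above.

```python
from dataclasses import dataclass, field


@dataclass
class SurfaceEntry:
    """One row of the inventory: a Copilot surface and its model policy."""
    surface: str
    available_models: list[str]
    default_model: str
    policy_owner: str
    multipliers: dict[str, float] = field(default_factory=dict)
    approved_uses: list[str] = field(default_factory=list)
    restricted_uses: list[str] = field(default_factory=list)
    telemetry: str = "unknown"
    review_cadence: str = "unknown"


def surfaces_offering(inventory: list[SurfaceEntry], model: str) -> list[str]:
    """Return the names of surfaces where the given model is listed."""
    return [entry.surface for entry in inventory if model in entry.available_models]


# Illustrative rows only; model IDs other than Gemini are placeholders.
inventory = [
    SurfaceEntry(
        surface="github.com Copilot Chat",
        available_models=["openai-model-a", "claude-model-a"],
        default_model="openai-model-a",
        policy_owner="platform-team",
        approved_uses=["explanations", "pull request questions"],
    ),
    SurfaceEntry(
        surface="VS Code",
        available_models=["openai-model-a", "claude-model-a", "gemini-3.5-flash"],
        default_model="openai-model-a",
        policy_owner="platform-team",
        multipliers={"gemini-3.5-flash": 14.0},
        restricted_uses=["security-sensitive review"],
    ),
]

print(surfaces_offering(inventory, "gemini-3.5-flash"))
```

Keeping the inventory in version control means a changelog like this one becomes a reviewable diff, not a surprise in someone's browser.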

If that sounds heavy, consider the alternative. A developer follows internal guidance written last week, opens web chat, and the model is gone. Another developer sees Gemini in the IDE and assumes it is approved for all Copilot workflows. A team enables Auto without understanding whether Auto optimizes for availability, rate-limit relief, quality, cost, or some product-specific mix. Finance sees a premium-usage spike after June 1 and nobody can map it back to a workflow. That is not governance. That is model roulette with an invoice.

Teams should also stop evaluating model brands in isolation. The relevant unit is task plus surface plus policy. “Gemini for Copilot” is vague. “Gemini 3.5 Flash for agentic IDE tasks in a pilot group, with cost dashboarding and comparison against GPT-5.4 and Claude Sonnet on failing-test repair” is useful. “GPT-5.2-Codex for Codex agent review tasks, not web chat” is useful. “Auto is allowed for low-risk chores but not security-sensitive review” is useful. The language has to get more precise because the products already have.
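The "task plus surface plus policy" unit can be written down as an explicit allowlist. The following is a sketch under stated assumptions: the rule set, the surface and task labels, and the `is_allowed` helper are hypothetical and simply encode the example sentences above. Anything not listed is denied by default.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    model: str
    surface: str  # "*" means any surface
    task: str


# Hypothetical rules mirroring the examples in the text.
ALLOWED_RULES = [
    Rule("Gemini 3.5 Flash", "ide-agent", "failing-test-repair"),
    Rule("GPT-5.2-Codex", "codex-agent", "code-review"),
    Rule("Auto", "*", "low-risk-chore"),
]


def is_allowed(model: str, surface: str, task: str) -> bool:
    """Default deny: permit only combinations that match an explicit rule."""
    return any(
        rule.model == model
        and rule.surface in ("*", surface)
        and rule.task == task
        for rule in ALLOWED_RULES
    )


checks = [
    ("GPT-5.2-Codex", "codex-agent", "code-review"),
    ("GPT-5.2-Codex", "web-chat", "code-review"),
    ("Auto", "web-chat", "low-risk-chore"),
    ("Auto", "web-chat", "security-review"),
]
for model, surface, task in checks:
    verdict = "allowed" if is_allowed(model, surface, task) else "denied"
    print(f"{model} on {surface} for {task}: {verdict}")
```

The default-deny choice is deliberate: when a platform adds or removes a model from a surface, nothing becomes approved by accident.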

There is a developer-experience case for GitHub’s simplification, too. Power users dislike losing options, and sometimes they are right. They may know a specific model handles a specific language, repo shape, or explanation style better. But the median engineer should not need to understand a vendor model portfolio to ask a question about a pull request. Defaults should be boring. Expert controls should exist where they produce better outcomes. Web chat is probably one of the surfaces where GitHub wants fewer, more predictable choices.

The broader competitive picture is also becoming clearer. GitHub is turning Copilot into a multi-provider enterprise routing layer. OpenAI is pushing Codex as an agent platform across CLI, app, browser, cloud tasks, code review, and automations. Anthropic’s Claude Code remains strong in terminal-native workflows. Cursor keeps pressure on the tight editor loop. The winning product will not simply be the one with the most models in the picker. It will be the one that makes model routing legible enough that teams can trust it.

That is why this small changelog deserves attention. It is a sign of the next phase. AI coding platforms will add models, remove models, split models by surface, hide models behind Auto, attach multipliers, deprecate options, and change defaults as quality and economics shift. The operational question is whether those changes are visible before developers discover them mid-task.

The editorial take: GitHub shrinking the Copilot web-chat model list is not retreat. It is the boring governance move multi-model platforms need. Infinite choice is not a strategy. The serious strategy is routing the right model to the right workflow, making the cost visible, and keeping the default experience stable enough that developers can do their jobs instead of babysitting a dropdown.

Sources: GitHub Changelog — Updates to available models in Copilot on web, GitHub Docs — Supported AI models in Copilot, GitHub Docs — Requests in GitHub Copilot, GitHub Docs — OpenAI Codex powered by Copilot, GitHub Changelog — Gemini 3.5 Flash for Copilot
