Copilot Chat’s PR Context Upgrade Is GitHub Admitting Review Needs a Workspace, Not a Chat Drawer

Anatoliy Kolodkin

04 Jun 2026 • 6 min read

GitHub’s latest Copilot Chat update looks small if you read it as a UI change. A button at the top of a diff. Chat beside code. Faster answers. Fine.

Read it as a review-system change and it gets more interesting. GitHub is making richer pull-request and diff context generally available on github.com for anyone with a Copilot license, after a public preview. The product direction is obvious: PR review is no longer just humans leaving comments on a diff, with AI living off to the side as a chatbot. GitHub is pulling code, discussion, summary, repository context, and AI-assisted comprehension into the same workspace.

That is the right move. It is also the point where teams need to stop treating Copilot-in-review as a novelty and start treating it as review infrastructure.

The diff is the right boundary

The feature gives reviewers a few entry points: the “Ask about this diff” button at the top of each diff, the Copilot button in GitHub’s top navigation, and line-level selection that can be sent to Copilot from a dropdown. GitHub says the experience is powered by new abilities for pull-request understanding, review, and summary, and that relevant PR and repository context is added when users ask about a diff or pull request.

That sounds obvious only because the old pattern was so awkward. A pull request is already a bundle of context: changed files, comments, commits, review threads, linked issues, CI status, repository conventions, and the shape of nearby code. Asking a detached chatbot to reason about a PR usually meant the reviewer became a context-packet courier: paste the diff, paste the failing test, explain the repo, add the relevant file, then hope the model noticed the line that mattered.

Putting Copilot next to the diff fixes the ergonomics. More importantly, it changes the unit of work. The question is no longer “can chat answer a code question?” The question is “can the review surface help a human ask narrower, better questions while staying anchored in the code?” That is a much healthier pattern.

A senior reviewer does not need an AI to solemnly announce that a PR “modifies authentication logic.” They need help with sharp, local questions: does this migration preserve the old nullable behavior, which callers still assume the deprecated field exists, what test path covers the new branch, why did this retry loop move after the transaction opens, and does this diff change the public API in a way the PR description did not admit? Those questions belong next to the code. A drawer chat that forces tab-switching is friction at exactly the wrong moment.

Richer context cuts both ways

The useful part of richer PR context is also the risky part: Copilot sees more. GitHub’s code review documentation now describes a broader agentic surface around reviews, including full project context gathering, suggestions that can be passed to Copilot cloud agent, Low and Medium review effort levels, MCP servers, repository agent skills, custom instructions, and Copilot Memory. Agentic review capabilities use GitHub Actions runners for work such as gathering project context and handing suggested fixes to the cloud agent. If Actions is unavailable or the relevant workflows fail, GitHub says reviews can still be generated, but without the additional agentic capabilities.

That is a lot of machinery attached to the humble PR. It means “Copilot reviewed this” may involve repository scanning, Actions runner usage, model calls, configured skills, memory, custom instructions, and possibly tool configuration that was originally justified for implementation work rather than review work. This is where teams need precision. Review is a privileged workflow because it sits near merge authority. Tool access that feels harmless in a coding assistant can become much more consequential when it is available in the review path.

MCP configuration is the obvious example. If a repository has MCP servers configured so agents can query service catalogs, docs, issue trackers, deployment metadata, or internal systems, teams should decide whether those tools belong in PR review. The answer may be yes for some repos. It should not be accidental. A review assistant that can inspect a service dependency graph is useful; a review assistant that can wander through operational systems because nobody separated implementation context from review context is governance by shrug.
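
One way to make that decision explicit is a per-context allowlist. The sketch below is a hypothetical policy model, not GitHub's or MCP's configuration format: the `ToolPolicy` class, the context names, and the tool names are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical policy: which tools each context may use.

    Nothing here mirrors a real GitHub or MCP setting; it only
    illustrates keeping review context separate from implementation context.
    """
    allowed: dict[str, frozenset[str]] = field(default_factory=dict)

    def allowed_tools(self, context: str, configured: set[str]) -> set[str]:
        # Unknown contexts get an empty allowlist, so access is opt-in.
        return configured & self.allowed.get(context, frozenset())


# Example tool names are invented placeholders.
policy = ToolPolicy(allowed={
    "implementation": frozenset(
        {"service_catalog", "docs", "issue_tracker", "deploy_metadata"}
    ),
    "review": frozenset({"service_catalog", "docs"}),
})

configured = {"service_catalog", "docs", "deploy_metadata"}
review_tools = policy.allowed_tools("review", configured)
unknown_tools = policy.allowed_tools("release", configured)
```

Here the review path keeps the service catalog and docs but loses deployment metadata, and a context nobody defined gets nothing. The point of the design is that adding a tool to review requires someone to write it down.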

Cost also moved into the review system. GitHub’s docs say Copilot code review consumes AI credits, and agentic review capabilities can consume GitHub Actions minutes. Medium review effort is in public preview and routes pull requests to a higher-reasoning model for longer analysis of complex logic, security-sensitive code, and cross-service changes. It also uses more AI credits and Actions minutes than Low. That is a perfectly reasonable tradeoff when the PR deserves it. It is an expensive default when teams enable it without knowing what changed.
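
The tradeoff is easier to discuss with the arithmetic written out. The sketch below is a minimal estimator under stated assumptions: the `EffortCost` type and `monthly_usage` function are hypothetical, and every number is a made-up placeholder, not GitHub pricing. Teams would substitute figures from their own billing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffortCost:
    ai_credits: float       # credits per review (placeholder, not real pricing)
    actions_minutes: float  # runner minutes per review (placeholder)


def monthly_usage(
    reviews_by_effort: dict[str, int],
    cost_by_effort: dict[str, EffortCost],
) -> tuple[float, float]:
    """Sum AI credits and Actions minutes across a month of reviews."""
    credits = sum(
        count * cost_by_effort[effort].ai_credits
        for effort, count in reviews_by_effort.items()
    )
    minutes = sum(
        count * cost_by_effort[effort].actions_minutes
        for effort, count in reviews_by_effort.items()
    )
    return credits, minutes


# Invented numbers, chosen only to show the shape of the comparison.
costs = {"low": EffortCost(1.0, 2.0), "medium": EffortCost(3.0, 5.0)}

all_low = monthly_usage({"low": 400}, costs)
mixed = monthly_usage({"low": 350, "medium": 50}, costs)
```

With these placeholder rates, moving 50 of 400 reviews to Medium raises both totals. Whether that increase is worth it depends on which 50 pull requests they were.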

The danger is authority theater

AI inside a PR review surface feels more authoritative than AI in a side chat. Same model class, different costume. If Copilot answers beside the diff, with repository context, in the same place where humans approve merges, its output inherits some of the page’s institutional gravity. That is product psychology, not model reliability.

GitHub is careful in its broader documentation: Copilot can identify issues and suggest fixes, but humans still need to validate the feedback. Teams should take that caveat seriously, because richer context can make wrong answers more persuasive. A model that has seen more of the repository may still miss a concurrency bug, invent a convention, misunderstand an authorization boundary, or recommend a “fix” that passes local reasoning and breaks production behavior. The higher the context quality, the more tempting it is to outsource judgment. That is exactly backwards.

The better use is controlled assistance. Ask Copilot to summarize risk by subsystem. Ask it to compare the diff against an existing caller pattern. Ask it to identify tests that should fail if the new behavior is wrong. Ask it to explain an unfamiliar file before you review the changed lines. Ask it to draft a checklist for a risky migration. Then do the review. Copilot should lower the cost of comprehension, not become a rubber stamp with better autocomplete.

There is also a reviewer-training angle here. Junior engineers often struggle with where to start on a large diff. A side-by-side Copilot surface can help them ask better questions and learn repository structure faster. But the team has to model the right behavior: narrow prompts, skepticism, verification, and follow-up in code, tests, and docs rather than “Copilot said LGTM.” If AI review becomes a substitute for learning review judgment, the organization gets faster comments and weaker reviewers. Bad trade.

What teams should actually do

First, write down what Copilot Chat in PRs is for. Good uses: comprehension, test-gap suggestions, risk summaries, unfamiliar-code explanation, and targeted line-level questions. Bad uses: approval, ownership transfer, security signoff, or replacing domain reviewers on critical changes. If that sounds bureaucratic, congratulations, you have discovered that code review is a control plane.

Second, inspect the settings around automatic Copilot code review, review effort, MCP access, repository skills, and custom instructions. Low review effort is the sane default for routine work. Medium belongs on security-sensitive PRs, cross-service changes, and complex logic where extra latency and spend are justified. If Actions minutes matter in your organization, track Copilot review runs like any other CI-adjacent workload.
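
A written rule for when Medium applies can be as small as the sketch below. It is an illustration only: the label names, path patterns, and `choose_review_effort` function are hypothetical, and it encodes the decision without assuming anything about how GitHub's settings would apply it.

```python
from fnmatch import fnmatch

# Hypothetical rules; a team would replace these with its own.
SENSITIVE_LABELS = {"security", "cross-service"}
SENSITIVE_PATHS = ["auth/*", "billing/*", "migrations/*"]


def choose_review_effort(labels: set[str], changed_files: list[str]) -> str:
    """Return "medium" only when a PR matches an explicit rule, else "low"."""
    if labels & SENSITIVE_LABELS:
        return "medium"
    touches_sensitive_path = any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in SENSITIVE_PATHS
    )
    return "medium" if touches_sensitive_path else "low"


routine = choose_review_effort({"docs"}, ["docs/readme.md"])
auth_change = choose_review_effort(set(), ["auth/session.py"])
labeled = choose_review_effort({"security"}, ["docs/readme.md"])
```

Low stays the default because it is the fall-through case; Medium has to be earned by a label or a path that someone deliberately listed.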

Third, keep suggested fixes on a short leash. Passing suggestions to Copilot cloud agent can be useful for mechanical changes, but it should not silently turn review feedback into write authority. Generated fix PRs still need normal CI, ownership, and human review. The easy path from “this comment found a bug” to “the agent opened a fix” is powerful. It is also how review tooling starts producing work faster than humans can understand it.
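
The short leash can be stated as two checks: a cap on how many agent-opened fix PRs may be in flight, and merge gates that ignore who opened the PR. The sketch below is a hypothetical model, not a GitHub API; the `FixPullRequest` fields, the function names, and the default cap of 3 are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixPullRequest:
    opened_by_agent: bool
    ci_passed: bool
    owner_approved: bool   # approval from an owner of the touched code
    human_approvals: int   # approvals from people, not bots


def may_hand_off(open_agent_fix_prs: int, max_open: int = 3) -> bool:
    """Allow a new hand-off only while the in-flight count is under the cap.

    The default cap is an arbitrary placeholder.
    """
    return open_agent_fix_prs < max_open


def may_merge(pr: FixPullRequest, required_human_approvals: int = 1) -> bool:
    """Apply the same gates to every PR.

    `opened_by_agent` is deliberately never consulted: an agent-opened
    fix gets no shortcut past CI, ownership, or human review.
    """
    return (
        pr.ci_passed
        and pr.owner_approved
        and pr.human_approvals >= required_human_approvals
    )


agent_fix = FixPullRequest(
    opened_by_agent=True, ci_passed=True, owner_approved=True, human_approvals=0
)
blocked = may_merge(agent_fix)
```

In this example the agent's fix passes CI and has an owner's sign-off but still cannot merge, because no human has approved it. The cap addresses the other failure mode: fixes arriving faster than reviewers can read them.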

Finally, teach reviewers to ask smaller questions. The strongest version of this feature is not “review this PR.” It is “does this changed branch preserve retry semantics?”, “which test would catch a nil response here?”, “compare this new validator with the existing one in the billing service,” and “summarize the blast radius if this config default is wrong.” The review surface gets better when the human stays in the loop with intent.

GitHub is right to move Copilot closer to the diff. PR review is context work, and context belongs where the reviewer is already thinking. But richer context is not automatically safer review. It expands the power, cost, and confidence of the tool at the same time. Use it like a sharper instrument: deliberately, visibly, and with a human still holding the handle.

Sources: GitHub Changelog, GitHub Docs: Copilot code review, GitHub Docs: Copilot models and pricing
