Your Agent's PRs Get Rejected Because They Ignore Your History: Learning Organic Commits via Repository Memory
Coding agents have made remarkable strides on benchmark leaderboards, yet teams deploying them in production keep running into the same quiet rejection: the AI-generated pull request technically works but doesn't fit. It duplicates an internal API that a teammate wrote two years ago. It ignores the project's established naming conventions. It violates an architectural constraint that was never written down anywhere and lives only in the history of how the codebase evolved. A new paper by Mo Li and colleagues, working at the intersection of software engineering and AI, names this gap organicity: the degree to which generated code matches the project-specific change patterns of a real codebase, not just its current state.
The key insight is that handing an agent a repository snapshot is not enough. The snapshot shows what the code is today; it doesn't reveal how that code came to be, that is, the patterns, preferences, and constraints accumulated through thousands of historical commits. To close this gap, the paper introduces Learning to Commit, a framework built around Online Repository Memory. The agent performs contrastive reflection on earlier commits: it attempts to resolve a historical issue blind, compares its own prediction against the oracle diff (the change that was actually committed), and distils the difference into a growing library of repository-specific lessons: which APIs the project prefers, how functions get named, which architectural boundaries hold. This memory is built incrementally across the commit history and wraps any underlying coding agent without requiring model changes. According to the paper, evaluation across multiple repositories shows measurable improvements in the organicity dimensions that standard pass-rate benchmarks miss entirely.
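The contrastive reflection loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the names HistoricalCommit, RepositoryMemory, build_memory, toy_attempt and toy_reflect are all hypothetical, and the two toy functions stand in for what would be calls to a coding agent and a reflection model in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HistoricalCommit:
    issue: str        # task description the agent sees
    oracle_diff: str  # the change that was actually committed


@dataclass
class RepositoryMemory:
    lessons: List[str] = field(default_factory=list)

    def add(self, lesson: str) -> None:
        # Skip empty strings and exact duplicates.
        if lesson and lesson not in self.lessons:
            self.lessons.append(lesson)


def build_memory(
    commits: List[HistoricalCommit],
    attempt: Callable[[str, List[str]], str],
    reflect: Callable[[str, str, str], List[str]],
) -> RepositoryMemory:
    """Walk the history oldest-first, growing the lesson library as we go."""
    memory = RepositoryMemory()
    for commit in commits:
        # 1. Blind attempt: the agent sees the issue and current lessons only.
        predicted = attempt(commit.issue, list(memory.lessons))
        # 2. Contrast the prediction with the oracle diff and keep the lessons.
        for lesson in reflect(commit.issue, predicted, commit.oracle_diff):
            memory.add(lesson)
    return memory


# Toy stand-ins (assumptions): a real system would call a model here.
def toy_attempt(issue: str, lessons: List[str]) -> str:
    if any("http_client" in lesson for lesson in lessons):
        return "+ resp = http_client.get(url)"
    return "+ resp = requests.get(url)"


def toy_reflect(issue: str, predicted: str, oracle: str) -> List[str]:
    if "http_client" in oracle and "http_client" not in predicted:
        return ["Prefer the internal http_client wrapper over requests."]
    return []


history = [
    HistoricalCommit("Fetch user profile", "+ resp = http_client.get(url)"),
    HistoricalCommit("Fetch billing data", "+ resp = http_client.get(url)"),
]
memory = build_memory(history, toy_attempt, toy_reflect)
print(memory.lessons)
```

In this toy run the first commit produces a lesson because the blind attempt missed the internal wrapper; on the second commit the agent already applies that lesson, so nothing new is recorded. Passing attempt and reflect as plain callables mirrors the point that the memory wraps an existing agent rather than modifying it.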
For teams already running coding agents in GitHub Actions, CI-triggered pipelines, or Claude Code alongside instruction files such as CLAUDE.md or AGENTS.md, the practical takeaway is direct: repository memory is a first-class engineering artifact, not an afterthought. Giving an agent read access to the repo and hoping for the best skips the history layer that experienced human reviewers have internalized over years. Distilling that history into a queryable memory layer, and maintaining it as the codebase evolves, is what separates agents that get merged from agents that get declined.
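To make "queryable" concrete, here is one hedged sketch of how stored lessons might be retrieved and placed in front of a task before it reaches the agent. Everything in it is an assumption for illustration: the function names tokenize, query_lessons and build_prompt are hypothetical, and plain word overlap is used as a deliberately crude relevance score where a real system would more likely use embeddings or a search index.

```python
import re
from typing import List


def tokenize(text: str) -> set:
    # Lowercased word and identifier tokens; no stopword filtering (crude).
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def query_lessons(lessons: List[str], task: str, top_k: int = 2) -> List[str]:
    """Return up to top_k lessons ranked by word overlap with the task."""
    task_tokens = tokenize(task)
    scored = [(len(task_tokens & tokenize(lesson)), lesson) for lesson in lessons]
    relevant = [pair for pair in scored if pair[0] > 0]
    # sort() is stable, so ties keep the order in which lessons were stored.
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [lesson for _, lesson in relevant[:top_k]]


def build_prompt(task: str, lessons: List[str]) -> str:
    """Prepend matching lessons to the task; pass the task through if none match."""
    hits = query_lessons(lessons, task)
    if not hits:
        return task
    bullets = "\n".join(f"- {lesson}" for lesson in hits)
    return f"Repository lessons:\n{bullets}\n\nTask: {task}"


lessons = [
    "Use the internal http_client wrapper for every outbound http call.",
    "Name database helper functions with a fetch_ prefix.",
    "Keep billing code out of the api package.",
]
task = "Add an http call that loads the billing invoice"
print(build_prompt(task, lessons))
```

Because the lessons are plain text, the same list could also be rendered into an instruction file that the agent already reads, which keeps the memory reviewable in version control like any other artifact.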