TDAD: Test-Driven Agentic Development — Reducing Code Regressions by 70%
Researchers have published a new tool called TDAD — Test-Driven Agentic Development — that takes a fundamentally different approach to keeping AI coding agents from breaking things they weren't supposed to touch. Instead of giving agents procedural instructions about how to test, TDAD builds a dependency graph that maps relationships between source files and their corresponding tests. When an agent prepares to commit a change, it can look up exactly which tests are relevant and verify them before proceeding. Tested on the SWE-bench Verified benchmark with Qwen3-Coder 30B, the results are striking: regression rates dropped from 6.08% down to 1.82%, a reduction of roughly 70%.
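The core mechanism can be sketched in a few lines. This is an illustrative Python sketch of the idea, not TDAD's actual implementation: assume some static analysis has produced a map of which source files each test imports, then invert that map into a dependency graph and query it with the files an agent changed.

```python
from collections import defaultdict

def build_dependency_graph(test_imports):
    """Invert {test_file: [source files it touches]} into
    {source_file: {tests that cover it}}."""
    graph = defaultdict(set)
    for test_file, sources in test_imports.items():
        for src in sources:
            graph[src].add(test_file)
    return graph

def relevant_tests(graph, changed_files):
    """Tests the agent should verify before committing this change."""
    tests = set()
    for path in changed_files:
        tests |= graph.get(path, set())
    return sorted(tests)

# Hypothetical project layout, purely for illustration.
test_imports = {
    "tests/test_auth.py": ["src/auth.py", "src/db.py"],
    "tests/test_api.py":  ["src/api.py", "src/auth.py"],
    "tests/test_db.py":   ["src/db.py"],
}
graph = build_dependency_graph(test_imports)
print(relevant_tests(graph, ["src/auth.py"]))
# → ['tests/test_api.py', 'tests/test_auth.py']
```

The point of the structure is that the agent never has to reason about test relevance from instructions; the graph answers the question mechanically, which is what the benchmark results suggest matters.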
The counterintuitive finding buried in the paper is worth dwelling on. When researchers tried simply adding detailed TDD instructions — telling agents step by step how to do test-driven development — regressions didn't improve. They got worse, climbing to 9.94%, more than 60% above the vanilla baseline. The implication is that procedural prompting is the wrong lever. What agents actually need isn't more instructions; it's better information about which tests are contextually relevant to the code they're changing.
TDAD is open source and submitted to ACM AIWare 2026, which means it's on a path toward peer review and likely wider adoption. For teams using AI agents in production codebases where regression safety is non-negotiable, this points toward a new category of "agent skills" — not prompts, but structured context tools — that could become standard infrastructure in the agentic coding stack.
Read the full article at arXiv (ACM AIWare 2026 submission) →