NEW research from CMU.
(bookmark this one)
The biggest unlock in coding agents is knowing how to run them asynchronously.
Simply giving a single agent more iterations helps, but does not scale well.
And multi-agent research shows that coordination > compute.
A new paper from CMU demonstrates this with a practical multi-agent system.
CAID (Centralized Asynchronous Isolated Delegation) borrows proven human SWE practices: a manager builds a dependency graph, delegates tasks to engineer agents who work in isolated git worktrees, execute concurrently, self-verify with tests, and integrate via git merge.
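To make the workflow concrete, here is a minimal Python sketch of the manager loop: build a dependency graph, hand every ready task to an engineer concurrently, and integrate each result as it finishes. This is not the paper's implementation. The task names, the `engineer` stub, and the `manager` function are all hypothetical, and the "merge" is just a list append standing in for a real git merge.

```python
import asyncio
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
DEPS = {
    "utils": set(),
    "parser": {"utils"},
    "storage": {"utils"},
    "api": {"parser", "storage"},
}


async def engineer(task: str, log: list[str]) -> str:
    """Stand-in for an engineer agent working in its own isolated workspace."""
    log.append(f"start {task}")
    await asyncio.sleep(0.01)  # placeholder for implementing and self-testing
    log.append(f"done {task}")
    return task


async def manager(deps: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Delegate tasks in dependency order, running independent ones concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    log: list[str] = []
    merged: list[str] = []
    running: set[asyncio.Task] = set()
    while sorter.is_active():
        # Launch every task whose dependencies have already been integrated.
        for task in sorter.get_ready():
            running.add(asyncio.create_task(engineer(task, log)))
        done, running = await asyncio.wait(
            running, return_when=asyncio.FIRST_COMPLETED
        )
        for finished in done:
            task = finished.result()
            merged.append(task)  # placeholder for the integration (merge) step
            sorter.done(task)    # unblocks tasks that depend on this one
    return merged, log


if __name__ == "__main__":
    merged, log = asyncio.run(manager(DEPS))
    print("merge order:", merged)
```

In this sketch `parser` and `storage` run at the same time because neither depends on the other, while `api` waits until both are integrated.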
CAID improves accuracy over single-agent baselines by 26.7% absolute on paper reproduction tasks (PaperBench) and 14.3% on Python library development tasks (Commit0).
The key insight is that isolation plus explicit integration beats both single-agent scaling and naive multi-agent approaches.
For long-horizon software engineering tasks, multi-agent coordination using git-native primitives should be the default strategy, not a fallback.
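For a sense of what "git-native primitives" could look like per task, here is a small Python sketch that only builds the command lists and does not run them. The branch naming (`agent/<task>`), the `../worktrees` layout, the use of pytest for self-verification, and the `--no-ff` merge flag are all assumptions made for illustration, not details from the paper.

```python
import shlex


def worktree_plan(
    task: str, base: str = "main", root: str = "../worktrees"
) -> dict[str, list[str]]:
    """Return the git commands one engineer agent might need for one task.

    Hypothetical helper: the names and layout are illustrative assumptions.
    """
    branch = f"agent/{task}"
    path = f"{root}/{task}"
    return {
        # Create a new branch checked out in its own directory (isolation).
        "isolate": ["git", "worktree", "add", "-b", branch, path, base],
        # Self-verify inside the worktree (assumes the project uses pytest).
        "verify": ["python", "-m", "pytest", "-q"],
        # Integrate from the base branch once the tests pass.
        "integrate": ["git", "merge", "--no-ff", branch],
        # Remove the worktree after a successful merge.
        "cleanup": ["git", "worktree", "remove", path],
    }


if __name__ == "__main__":
    for step, cmd in worktree_plan("parser").items():
        print(f"{step}: {shlex.join(cmd)}")
```

Each agent gets its own checkout and branch, so concurrent edits cannot collide until the explicit merge step.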
Paper: arxiv.org/abs/2603.21489
Learn to build effective AI agents in our academy: academy.dair.ai