NEW research from CMU.

(bookmark this one)

The biggest unlock in coding agents is knowing how to run them asynchronously.

Simply giving a single agent more iterations helps, but does not scale well.

And multi-agent research shows that coordination > compute.

A new paper from CMU demonstrates this with a practical multi-agent system.

CAID (Centralized Asynchronous Isolated Delegation) borrows proven human SWE practices: a manager builds a dependency graph, delegates tasks to engineer agents who work in isolated git worktrees, execute concurrently, self-verify with tests, and integrate via git merge.
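Here is a rough sketch of that manager loop, to make the workflow concrete. It is not the paper's implementation: the function names (run_manager, run_engineer, worktree_commands), the branch and path naming, and the thread pool are all assumptions made for illustration. The git commands are only built as argument lists, never executed, and the engineer is a placeholder where a real agent would edit code and run tests.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def worktree_commands(task: str) -> list[list[str]]:
    """Git commands one delegated task would need (built here, not executed).

    The branch and directory naming scheme is hypothetical.
    """
    branch = f"agent/{task}"
    return [
        ["git", "worktree", "add", "-b", branch, f"../wt-{task}"],  # isolated checkout
        ["git", "merge", "--no-ff", branch],  # integration back on the main branch
    ]


def run_engineer(task: str) -> str:
    """Placeholder engineer agent: a real one would work and self-test in its worktree."""
    return task


def run_manager(deps: dict[str, set[str]], max_workers: int = 4) -> list[str]:
    """Delegate each task once its dependencies are integrated; return the merge order.

    `deps` maps a task to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    merged: list[str] = []
    pending = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task whose dependencies are all merged can start now, concurrently.
            for task in sorter.get_ready():
                pending.add(pool.submit(run_engineer, task))
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                task = future.result()
                merged.append(task)  # stand-in for the actual `git merge`
                sorter.done(task)  # unblocks tasks that depend on this one
    return merged


if __name__ == "__main__":
    graph = {"utils": set(), "core": {"utils"}, "io": {"utils"}, "tests": {"core", "io"}}
    print(run_manager(graph))
```

In this example graph, "core" and "io" both depend only on "utils", so they can run at the same time, while "tests" waits until both are merged. Swapping run_engineer for a real agent call and executing the commands from worktree_commands would be the next step, with merge-conflict handling left out of this sketch entirely.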

CAID improves accuracy over single-agent baselines by 26.7% absolute on paper reproduction tasks (PaperBench) and 14.3% on the Python library development tasks (Commit0).

The key insight is that isolation plus explicit integration beats both single-agent scaling and naive multi-agent approaches.

For long-horizon software engineering tasks, multi-agent coordination using git-native primitives should be the default strategy, not a fallback.

Paper: arxiv.org/abs/2603.21489

Learn to build effective AI agents in our academy: academy.dair.ai

Mar 30 at 2:39 PM