Researchers quietly slipped AI-written briefs into the Jessup International Law Moot Court, the world’s premier student advocacy contest, and those entries earned “average to near-perfect” marks for presentation. A closer examination, however, revealed invented citations and shaky reasoning, showing that polished form can mask flimsy substance.

This isn’t a universal failure, but it happens often enough to matter. Benchmarks like AbstentionBench show models internally registering doubt while still giving confidently phrased answers. The good news is that this should be fixable: models can be fine-tuned to state their uncertainty and confidence explicitly. Otherwise, we risk hiding failures behind clever rhetoric.

Jun 19 at 4:40 PM