Every day, 100+ people ask me, "How can I learn AI evals?"

I copy-paste these 10 links (every time):

Using LLM-as-a-judge: hamel.dev/blog/posts/ll…

Demystifying evals for AI agents: anthropic.com/engineeri…

There are only 6 RAG Evals: jxnl.co/writing/2025/05…

Evaluation-driven development: decodingai.com/p/stop-l…

Binary evals vs. Likert scales: decodingai.com/p/the-5-…

The mirage of generic AI metrics: decodingai.com/p/the-mi…

Error analysis: youtube.com/watch?v=e2i…

Carrying out error analysis: youtube.com/watch?v=JoA…

Evaluating the effectiveness of LLM-evaluators: eugeneyan.com/writing/l…

LLM judges aren't the shortcut you think: youtube.com/watch?v=sEM…

Binge these to skyrocket your skills.

Feb 9 at 8:54 AM