When I wear the consulting quant hat and it is obvious that we can’t measure something important properly, or that our metrics all rest on violated assumptions, a sound qualitative evaluation is far preferable to some easily quantified numbers. But it can be hard to convince decision makers to pass over the numbers that look nice in favor of a sound argument without numbers.
A colleague of mine once spoke to Yoav Benjamini and was shocked to hear him suggest that a scientist should judge the quality of their biological clustering qualitatively rather than rely on some number (probably one not backed by sound theory). My reaction: yes, of course. Even a co-inventor of false discovery rate control, or rather especially someone like him, would never advocate blind adherence to numbers. Statisticians are the ultimate critical number theorists and number refutationists.
So, this retrospective from OpenAI is a good one. Do take sound, theoretically guided qualitative evaluations from experts with LOTS of tacit knowledge seriously.