This quote is from IFBench. It's both a great benchmark and a nicely done analysis of RLVR and generalization in the instruction-following domain!
arxiv.org
A crucial factor for successful human and AI interaction is the ability of language models or chatbots to follow human instructions precisely. A common feature of instructions are output constraints like "only answer with yes or no" or "mention the word 'abrakadabra' at least 3 times" that the use…
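
Constraints like the two quoted above are attractive for RLVR because a few lines of code can check them. Here is a minimal Python sketch of what such checkers might look like. The function names and the exact matching rules (case-insensitive, whole-word, trailing punctuation ignored) are my own assumptions for illustration, not IFBench's actual verification code.

```python
import re


def only_yes_or_no(response: str) -> bool:
    """Hypothetical check for the constraint "only answer with yes or no".

    Assumption: case and a trailing '.' or '!' are ignored.
    """
    cleaned = response.strip().strip(".!").lower()
    return cleaned in {"yes", "no"}


def mentions_word_at_least(response: str, word: str, n: int) -> bool:
    """Hypothetical check for "mention the word X at least n times".

    Assumption: matches are whole-word and case-insensitive.
    """
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, response, flags=re.IGNORECASE)) >= n


if __name__ == "__main__":
    print(only_yes_or_no("Yes."))
    print(mentions_word_at_least(
        "Abrakadabra! I said abrakadabra, abrakadabra.", "abrakadabra", 3))
```

Each checker returns a plain pass/fail, which could serve as a binary reward signal; how the real benchmark scores responses may well differ.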