I’m publishing a long-form overview of using rubrics for RL tomorrow. Here are all of the papers that it will cover. Am I missing anything?
[1] Gunjal, Anisha, et al. "Rubrics as rewards: Reinforcement learning beyond verifiable domains." arXiv preprint arXiv:2507.17746 (2025).
[2] Huang, Zenan, et al. "Reinforcement learning with rubric anchors." arXiv preprint arXiv:2508.12790 (2025).
[3] Liu, Tianci, et al. "OpenRubrics: Towards scalable synthetic rubric generation for reward modeling and LLM alignment." arXiv preprint arXiv:2510.07743 (2025).
[4] Shao, Rulin, et al. "DR Tulu: Reinforcement learning with evolving rubrics for deep research." arXiv preprint arXiv:2511.19399 (2025).
[5] Xu, Ran, et al. "Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training." arXiv preprint arXiv:2602.01511 (2026).
[6] Xu, Wenyuan, et al. "A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization." arXiv preprint arXiv:2504.04950 (2025).
[7] Zheng, Lianmin, et al. "Judging LLM-as-a-judge with MT-Bench and Chatbot Arena." Advances in Neural Information Processing Systems 36 (2023): 46595-46623.
[8] Viswanathan, Vijay, et al. "Checklists are better than reward models for aligning language models." arXiv preprint arXiv:2507.18624 (2025).
[9] Mu, Tong, et al. "Rule-based rewards for language model safety." Advances in Neural Information Processing Systems 37 (2024): 108877-108901.
[10] Gupta, Taneesh, et al. "CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling." Findings of the Association for Computational Linguistics: ACL 2025. 2025.
[11] Wu, Mian, et al. "RLAC: Reinforcement learning with adversarial critic for free-form generation tasks." arXiv preprint arXiv:2511.01758 (2025).
[12] Xie, Lipeng, et al. "Auto-Rubric: Learning to extract generalizable criteria for reward modeling." arXiv preprint arXiv:2510.17314 (2025).
[13] Bai, Yuntao, et al. "Constitutional AI: Harmlessness from AI feedback." arXiv preprint arXiv:2212.08073 (2022).
[14] Guan, Melody Y., et al. "Deliberative alignment: Reasoning enables safer language models." arXiv preprint arXiv:2412.16339 (2024).
[15] Liu, Yang, et al. "G-Eval: NLG evaluation using GPT-4 with better human alignment." arXiv preprint arXiv:2303.16634 (2023).
[16] Arora, Rahul K., et al. "HealthBench: Evaluating large language models towards improved human health." arXiv preprint arXiv:2505.08775 (2025).
[17] Deshpande, Kaustubh, et al. "MultiChallenge: A realistic multi-turn conversation evaluation benchmark challenging to frontier LLMs." Findings of the Association for Computational Linguistics: ACL 2025. 2025.