Interested in learning how to run RL at scale? Here are the best resources to read…

Research on Scaling RL

  1. The Art of Scaling RL Compute for LLMs: arxiv.org/abs/2510.13786

  2. Scaling Behaviors of LLM RL Post-Training: arxiv.org/abs/2509.25300

  3. Optimally Scaling Sampling Compute for LLM RL: arxiv.org/abs/2603.12151

  4. Scaling up RL: arxiv.org/abs/2507.12507

  5. ProRL V2 - Prolonged Training Validates RL Scaling Laws: hijkzzz.notion.site/pro…

  6. Polaris - A Recipe for Scaling RL with Reasoning Models: hkunlp.github.io/blog/2…

RL Frameworks

  1. Hybrid Flow (outline of the verl framework): arxiv.org/abs/2409.19256

    1. More up-to-date info can be found here: arxiv.org/abs/2601.18150

  2. AReal - Large-Scale Async RL: arxiv.org/abs/2505.24298

  3. PipelineRL - Fast On-Policy RL: arxiv.org/abs/2509.19128

  4. AsyncFlow - Async Streaming RL: arxiv.org/abs/2507.01663

RL for Agents

  1. DeepSWE - Open Coding Agent Trained w/ RL: together.ai/blog/deepswe

  2. AutoForge - Environment Synthesis for Agentic RL: arxiv.org/abs/2512.22857

  3. Agent-R1 - Training Agents w/ End-to-End RL: arxiv.org/abs/2511.14460

  4. AgentRL - Scaling RL for Multi-Turn, Multi-Task Agents: arxiv.org/abs/2510.04206

  5. The Landscape of Agentic RL: arxiv.org/abs/2509.02547

  6. Training SWE Agents with RL: arxiv.org/abs/2508.03501

Case Studies & Tech Reports

  1. Kimi tech reports:

    1. Kimi K2 - Open Agentic Intelligence: arxiv.org/abs/2507.20534

    2. Kimi End-to-end Agentic RL: moonshotai.github.io/Ki…

    3. Kimi K1.5 - Scaling RL for LLMs: arxiv.org/abs/2501.12599

  2. Composer series from Cursor:

    1. Composer 2: arxiv.org/abs/2603.24477

    2. Composer 2.5: cursor.com/blog/compose…

  3. Olmo 3 (also has open code / data): arxiv.org/abs/2512.13961

  4. MiniMax tech reports:

    1. MiniMax-M2: arxiv.org/abs/2605.26494

    2. MiniMax-M1: arxiv.org/abs/2506.13585

  5. Nemotron 3 (NVIDIA): arxiv.org/abs/2512.20856

Jun 2 at 3:13 PM