Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model Paper • 2510.18855 • Published Oct 21 • 71
InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training Paper • 2510.15859 • Published Oct 17 • 10