brando/olympiad-bench-imo-math-boxed-825-v2-21-08-2024 Viewer • Updated Nov 6, 2024 • 1.65k • 101 • 5
The Smol Training Playbook 📚 • The secrets to building world-class LLMs • Featured • 2.82k
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 177
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning Paper • 2509.13305 • Published Sep 16, 2025 • 91
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22, 2025 • 160