Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Paper • 2512.15687 • Published 19 days ago • 18
GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators Paper • 2512.19682 • Published 14 days ago • 15
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published 18 days ago • 30
Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance Article • 27 days ago • 82
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 13 days ago • 60
Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly? Paper • 2511.13646 • Published Nov 17, 2025 • 8
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 165
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning Paper • 2511.14460 • Published Nov 18, 2025 • 20
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 108
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents Paper • 2511.13593 • Published Nov 17, 2025 • 25
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation Paper • 2511.20256 • Published Nov 25, 2025 • 27
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24, 2025 • 60
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning Paper • 2511.19900 • Published Nov 25, 2025 • 48
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory Paper • 2511.21678 • Published Nov 26, 2025 • 12
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published Nov 24, 2025 • 32
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies Paper • 2512.02581 • Published Dec 2, 2025 • 14
InnoGym: Benchmarking the Innovation Potential of AI Agents Paper • 2512.01822 • Published Dec 1, 2025 • 35