GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators Paper • 2512.19682 • Published 9 days ago • 15
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12, 2025 • 68
Demystifying Reinforcement Learning in Agentic Reasoning Paper • 2510.11701 • Published Oct 13, 2025 • 31
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models Paper • 2509.06949 • Published Sep 8, 2025 • 55
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23, 2025 • 29
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning Paper • 2506.03136 • Published Jun 3, 2025 • 25
Training-free Diffusion Acceleration with Bottleneck Sampling Paper • 2503.18940 • Published Mar 24, 2025 • 12
Temporal Consistency for LLM Reasoning Process Error Identification Paper • 2503.14495 • Published Mar 18, 2025 • 11
WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes Paper • 2503.13435 • Published Mar 17, 2025 • 18
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Paper • 2502.12148 • Published Feb 17, 2025 • 17
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening Paper • 2502.12146 • Published Feb 17, 2025 • 16
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published Feb 10, 2025 • 21
ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Paper • 2502.04306 • Published Feb 6, 2025 • 20
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Paper • 2410.09008 • Published Oct 11, 2024 • 17