Article Ulysses Sequence Parallelism: Training with Million-Token Contexts 4 days ago • 15
In-Context Reinforcement Learning for Tool Use in Large Language Models Paper • 2603.08068 • Published 4 days ago • 20
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning Paper • 2603.10160 • Published 3 days ago • 20
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 3 days ago • 57
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Paper • 2603.09877 • Published 3 days ago • 37
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data Paper • 2603.09206 • Published 3 days ago • 41
AutoResearch-RL: Perpetual Self-Evaluating Reinforcement Learning Agents for Autonomous Neural Architecture Discovery Paper • 2603.07300 • Published 6 days ago • 14
Article Granite 4.0 1B Speech: Compact, Multilingual, and Built for the Edge 4 days ago • 8
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs Paper • 2603.02083 • Published 11 days ago • 9
Progressive Residual Warmup for Language Model Pretraining Paper • 2603.05369 • Published 8 days ago • 32
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published 7 days ago • 26
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval Paper • 2603.04743 • Published 8 days ago • 47
Large Multimodal Models as General In-Context Classifiers Paper • 2602.23229 • Published 15 days ago • 22
Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling Paper • 2603.04791 • Published 8 days ago • 16