Jailbreak Distillation: Renewable Safety Benchmarking Paper • 2505.22037 • Published May 28, 2025 • 1
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published Oct 9, 2025 • 41
Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models Paper • 2510.21978 • Published Oct 24, 2025 • 16
Reasoning over mathematical objects: on-policy reward modeling and test time aggregation Paper • 2603.18886 • Published 28 days ago • 6
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs Paper • 2603.09095 • Published Mar 10 • 29
Feedback by Design: Understanding and Overcoming User Feedback Barriers in Conversational Agents Paper • 2602.01405 • Published Feb 1 • 1