research-catchup
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Paper
•
2508.01059
•
Published
•
32
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper
•
2508.01191
•
Published
•
238
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper
•
2508.05629
•
Published
•
180
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper
•
2508.06471
•
Published
•
195
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper
•
2508.10833
•
Published
•
44
Paper
•
2508.10104
•
Published
•
291
SSRL: Self-Search Reinforcement Learning
Paper
•
2508.10874
•
Published
•
97
Thyme: Think Beyond Images
Paper
•
2508.11630
•
Published
•
81
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
•
2509.02547
•
Published
•
228