When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27, 2024 • 26
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31, 2024 • 64
Direct Preference Optimization Using Sparse Feature-Level Constraints Paper • 2411.07618 • Published Nov 12, 2024 • 17
Control LLM: Controlled Evolution for Intelligence Retention in LLM Paper • 2501.10979 • Published Jan 19, 2025 • 6
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published Jun 1, 2025 • 38
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs Paper • 2506.05629 • Published Jun 5, 2025 • 37
SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning Paper • 2506.19767 • Published Jun 24, 2025 • 15
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4, 2025 • 76
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels Paper • 2509.16596 • Published Sep 20, 2025 • 14
Interactive Training: Feedback-Driven Neural Network Optimization Paper • 2510.02297 • Published Oct 2, 2025 • 43
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published Oct 21, 2025 • 113
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models Paper • 2510.25889 • Published Oct 29, 2025 • 66
ROOT: Robust Orthogonalized Optimizer for Neural Network Training Paper • 2511.20626 • Published Nov 25, 2025 • 43
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published 10 days ago • 39
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published 9 days ago • 31