FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading Paper • 2502.11433 • Published Feb 17, 2025 • 36
DPO-Shift: Shifting the Distribution of Direct Preference Optimization Paper • 2502.07599 • Published Feb 11, 2025 • 15
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 123
SmolVLM Grows Smaller – Introducing the 256M & 500M Models! Article • Jan 23, 2025 • 189
Diffusion Model Alignment Using Direct Preference Optimization Paper • 2311.12908 • Published Nov 21, 2023 • 49
Facial Recognition Collection Face detection and recognition models that can be used for facial recognition in Immich. Models are sorted by size in descending order. • 4 items • Updated Nov 10, 2023 • 12
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models Paper • 2402.19481 • Published Feb 29, 2024 • 22