Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation • Paper • 2511.14993 • Published 21 days ago • 222
TokensGen: Harnessing Condensed Tokens for Long Video Generation • Paper • 2507.15728 • Published Jul 21 • 7
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining • Paper • 2507.14119 • Published Jul 18 • 58
T-LoRA: Single Image Diffusion Model Customization Without Overfitting • Paper • 2507.05964 • Published Jul 8 • 119
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement • Paper • 2506.07634 • Published Jun 9 • 6
Speechless: Speech Instruction Training Without Speech for Low Resource Languages • Paper • 2505.17417 • Published May 23 • 14
Quartet: Native FP4 Training Can Be Optimal for Large Language Models • Paper • 2505.14669 • Published May 20 • 78
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers • Paper • 2504.20752 • Published Apr 29 • 92
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention • Paper • 2504.06261 • Published Apr 8 • 110
ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning • Paper • 2406.14130 • Published Jun 20, 2024 • 10
Linear Transformers with Learnable Kernel Functions are Better In-Context Models • Paper • 2402.10644 • Published Feb 16, 2024 • 81
Aligning Text-to-Image Diffusion Models with Reward Backpropagation • Paper • 2310.03739 • Published Oct 5, 2023 • 22