Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published 14 days ago • 30
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives Paper • 2512.14699 • Published 16 days ago • 27
Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model Paper • 2512.13507 • Published 17 days ago • 38
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning Paper • 2512.12799 • Published 18 days ago • 10
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25, 2025 • 32
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 177
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning Paper • 2510.13809 • Published Oct 15, 2025 • 37
RewardDance: Reward Scaling in Visual Generation Paper • 2509.08826 • Published Sep 10, 2025 • 73