NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published 20 days ago • 60
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time Paper • 2512.25075 • Published 25 days ago • 15
Pretraining Frame Preservation in Autoregressive Video Memory Compression Paper • 2512.23851 • Published 27 days ago • 24
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 132
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer Paper • 2509.22414 • Published Sep 26, 2025 • 22
Reconstruction Alignment Improves Unified Multimodal Models Paper • 2509.07295 • Published Sep 8, 2025 • 40
Waver: Wave Your Way to Lifelike Video Generation Paper • 2508.15761 • Published Aug 21, 2025 • 36
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory Paper • 2506.18903 • Published Jun 23, 2025 • 22
Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published May 22, 2025 • 24
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published Mar 25, 2025 • 73
Edit Transfer: Learning Image Editing via Vision In-Context Relations Paper • 2503.13327 • Published Mar 17, 2025 • 29
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models Paper • 2503.12885 • Published Mar 17, 2025 • 43
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12, 2025 • 33
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published Mar 7, 2025 • 36
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers Paper • 2502.15894 • Published Feb 21, 2025 • 20
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published Feb 24, 2025 • 79