RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments Paper • 2604.26067 • Published 9 days ago • 73
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published 10 days ago • 68
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 10 days ago • 116
EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published 26 days ago • 12
Hierarchical Codec Diffusion for Video-to-Speech Generation Paper • 2604.15923 • Published 20 days ago • 2
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 22 days ago • 117
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video Paper • 2604.07882 • Published 28 days ago • 9
Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting Paper • 2604.12626 • Published 23 days ago • 15
Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music Paper • 2604.10905 • Published 24 days ago • 28
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 24 days ago • 71
Strips as Tokens: Artist Mesh Generation with Native UV Segmentation Paper • 2604.09132 • Published 27 days ago • 55
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published 29 days ago • 41
FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On Paper • 2604.08526 • Published 28 days ago • 20
OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence Paper • 2604.07296 • Published 29 days ago • 39
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization Paper • 2604.04787 • Published Apr 6 • 12
GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation Paper • 2603.26661 • Published Mar 27 • 26
DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data Paper • 2604.01666 • Published Apr 2 • 10