Glance: Accelerating Diffusion Models with 1 Sample Paper • 2512.02899 • Published 5 days ago • 23
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published 23 days ago • 44
Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated 18 days ago • 30
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 102
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3 • 34
From Charts to Code: A Hierarchical Benchmark for Multimodal Models Paper • 2510.17932 • Published Oct 20 • 7
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 116
V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published Apr 8 • 13
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published Mar 26 • 4