2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining • Paper • 2501.00958 • Published Jan 1, 2025
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review • Paper • 2502.16586 • Published Feb 23, 2025