MiCo: Multi-image Contrast for Reinforcement Visual Reasoning Paper • 2506.22434 • Published Jun 27 • 10
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17 • 77
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs Paper • 2510.18876 • Published Oct 21 • 36
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published 20 days ago • 63
Diversity Has Always Been There in Your Visual Autoregressive Models Paper • 2511.17074 • Published 16 days ago • 7
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation Paper • 2511.20256 • Published 12 days ago • 26
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models Paper • 2511.22787 • Published 10 days ago • 8