Multi-Objective Task-Aware Predictor for Image-Text Alignment Paper • 2510.00766 • Published Oct 1, 2025 • 2
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models Paper • 2511.22787 • Published 12 days ago • 8
Can Large Language Models Infer and Disagree Like Humans? Paper • 2305.13788 • Published May 23, 2023
Stable Language Model Pre-training by Reducing Embedding Variability Paper • 2409.07787 • Published Sep 12, 2024
Diffusion Models Through a Global Lens: Are They Culturally Inclusive? Paper • 2502.08914 • Published Feb 13, 2025
When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts Paper • 2503.16826 • Published Mar 21, 2025
Can LVLMs and Automatic Metrics Capture Underlying Preferences of Blind and Low-Vision Individuals for Navigational Aid? Paper • 2502.14883 • Published Feb 15, 2025
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions Paper • 2503.13369 • Published Mar 17, 2025 • 7