-
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Paper • 2510.15444 • Published • 147 -
PICABench: How Far Are We from Physically Realistic Image Editing?
Paper • 2510.17681 • Published • 62 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 535
Collections
Discover the best community collections!
Collections including paper arxiv:2510.17681
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 2.12k • 543 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 14.4k • 366 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 17
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
Paper • 2411.18499 • Published • 18 -
VLSBench: Unveiling Visual Leakage in Multimodal Safety
Paper • 2411.19939 • Published • 10 -
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
Paper • 2412.02611 • Published • 26 -
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
Paper • 2412.03205 • Published • 18
-
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Paper • 2510.15444 • Published • 147 -
PICABench: How Far Are We from Physically Realistic Image Editing?
Paper • 2510.17681 • Published • 62 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 535
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 2.12k • 543 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 14.4k • 366 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 17
-
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
Paper • 2411.18499 • Published • 18 -
VLSBench: Unveiling Visual Leakage in Multimodal Safety
Paper • 2411.19939 • Published • 10 -
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
Paper • 2412.02611 • Published • 26 -
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
Paper • 2412.03205 • Published • 18
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23