Computer Use
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents Paper • 2504.00906 • Published Apr 1 • 26
The Unreasonable Effectiveness of Scaling Agents for Computer Use Paper • 2510.02250 • Published Oct 2 • 24
Multimodal Reasoning
A collection for Multimodal Reasoning Models and Benchmarks.
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models Paper • 2502.16033 • Published Feb 22 • 18
rippleripple/MMIR Viewer • Updated Feb 25 • 534 • 150 • 2
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 130
GRIT: Teaching MLLMs to Think with Images Paper • 2505.15879 • Published May 21 • 12