Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization Paper • 2407.06129 • Published Jul 8, 2024 • 1
Croissant Checker - Dev 🔎 Validate Croissant dataset files for NeurIPS submissions • Running • Agents • 24
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 7 days ago • 145
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 6 days ago • 264
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published Dec 31, 2025 • 109
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published 11 days ago • 49
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation Paper • 2604.18240 • Published 20 days ago • 16
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents Paper • 2604.18543 • Published 20 days ago • 27
ClawEnvKit Collection • Scalable Environment Generation for Claw-Like Agents • 3 items • Updated 18 days ago
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published 20 days ago • 84