SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published May 26, 2025 • 91
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 259
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published Sep 11, 2025 • 239
A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published Oct 27, 2025 • 65
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction Paper • 2510.22706 • Published Oct 26, 2025 • 39
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29, 2025 • 45
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published Oct 24, 2025 • 22
PyTorch Distributed: Experiences on Accelerating Data Parallel Training Paper • 2006.15704 • Published Jun 28, 2020 • 3
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel Paper • 2304.11277 • Published Apr 21, 2023 • 4
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement Paper • 2506.07634 • Published Jun 9, 2025 • 6
Zep: A Temporal Knowledge Graph Architecture for Agent Memory Paper • 2501.13956 • Published Jan 20, 2025 • 7
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 303
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning Paper • 2510.22543 • Published Oct 26, 2025 • 10