SWE-bench
SWE-bench (Lite, Verified, Multimodal, Multilingual) all in one place!
SWE-bench/SWE-bench_Verified • Benchmark • Updated Feb 27 • 500 • 126k • 31
SWE-bench/SWE-bench_Multilingual • Viewer • Updated Aug 26, 2025 • 300 • 16k • 7
SWE-bench/SWE-bench_Multimodal • Viewer • Updated Apr 29, 2025 • 612 • 1.07k • 9
SWE-bench/SWE-bench_Lite • Viewer • Updated Apr 29, 2025 • 323 • 28.8k • 12

SWE-agent-LM
A collection of language models trained on SWE-smith + (mini-)SWE-agent for SWE-bench tasks
SWE-bench/SWE-agent-LM-32B • Text Generation • 33B • Updated May 12, 2025 • 397 • 79
SWE-bench/SWE-agent-LM-7B • Text Generation • 8B • Updated Jul 13, 2025 • 60.2k • 6
SWE-bench/SWE-Rater-32B • 33B • Updated Jun 1, 2025 • 17 • 3

SWE-smith
SWE-smith datasets of task instances for different programming languages
SWE-bench/SWE-smith-py • Viewer • Updated Dec 18, 2025 • 50.9k • 3.59k • 2
SWE-bench/SWE-smith-go • Viewer • Updated Dec 18, 2025 • 8.21k • 1.89k
SWE-bench/SWE-smith-rs • Viewer • Updated Feb 6 • 5.31k • 1.75k • 1
SWE-bench/SWE-smith-cpp • Viewer • Updated Mar 9 • 5.12k • 1.73k