guanning-ai's Collections
Reasoning-Benchmarks
Updated 10 days ago
A collection of multiple benchmarks for evaluating large reasoning models.
guanning/amc23 • Updated May 25, 2025 • 40 • 1
guanning/math • Updated Jun 12, 2025 • 12.5k • 5
guanning/aime24 • Updated May 25, 2025 • 30 • 1
guanning/aime25 • Updated May 25, 2025 • 30 • 1
guanning/gsm8k • Updated May 25, 2025 • 8.79k • 7
guanning/olympiadbench • Updated May 28, 2025 • 675 • 41
guanning-ai/dapo17k • Updated Nov 9, 2025 • 17.2k • 2
guanning-ai/dapo14k • Updated Jun 11, 2025 • 14k • 3
guanning-ai/mmlu-pro • Updated Jul 4, 2025 • 12k • 1
guanning-ai/knowlogic-en • Updated Jul 10, 2025 • 2.4k • 1
guanning-ai/bigmath • Updated Jul 27, 2025 • 251k • 1
guanning-ai/COM2 • Updated Aug 6, 2025 • 3.76k • 1
guanning-ai/beyondaime • Updated Oct 21, 2025 • 100 • 35
guanning-ai/Polaris-53K • Updated Dec 11, 2025 • 53.3k • 1
guanning-ai/openr1-93K • Updated Dec 11, 2025 • 93.7k • 2
guanning-ai/gsm8k-mugglemath • Updated Dec 27, 2025 • 157k • 4
guanning-ai/gsm8k-metamath • Updated Dec 30, 2025 • 160k • 12
guanning-ai/gsm8k-mumath • Updated Dec 27, 2025 • 92k • 4
guanning-ai/minervamath • Updated Jan 2 • 272 • 4
guanning-ai/gsm8k-platinum • Updated Jan 7 • 1.21k • 4