RLAIF/numina-math-llama-3.1-8b-bon-meta-cot • 680k • 292
RLAIF/optim_policy_pretrain-pythia-160m_lr0.0001_bs24_wp1_wd0.01_ep0_cp35k-merged • 700k • 54
RLAIF/TIR-Batched-PRM-Seed-Rollouts • 160k • 21
RLAIF/dec_09_token_baseline_ds_math_llama_3_1_405b_tmp07_together • 2.5k • 5
RLAIF/dec09_token_thinking_shrt_ds_math_llama_3_1_8b_instruc_tmp07 • 2.5k • 7
RLAIF/Value-v2-NUMINA-V2-Blocks-Merged-1999-problems-step-len-filtered • 32.3k • 10
RLAIF/Value-v2-NUMINA-V2-Blocks-Merged-980-problems-step-len-filtered • 15.8k • 12
RLAIF/Value-v1-NUMINA-V1-Blocks-Merged • 64k • 7
RLAIF/NUMINA-V1-Blocks-Merged • 18.5M • 5
RLAIF/Value-v1-NUMINA-V1-Blocks-Merged-3194-problems-step-len-filtered • 44.2k • 14
RLAIF/Value-v1-NUMINA-V1-Blocks-Merged-2964-problems-step-len-filtered • 41k • 10
RLAIF/Value-v1-NUMINA-V1-Blocks-Merged-1620-problems-step-len-filtered • 21.1k • 11
RLAIF/CODE-BEHAVIOR-NUMINA-V1-Blocks • 20.9k • 6
RLAIF/test_public_private • 1 • 6