TAUR-dev/M-0903_rl_reflect__0epoch_alltask__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1d_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1f_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1c_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1e_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1a_1epoch_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1b_3args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-skillfactory_yolo_3b-sft
TAUR-dev/M-skillfactory_yolo_2b-sft
TAUR-dev/M-skillfactory_yolo_3a_sft
TAUR-dev/M-skillfactory_yolo_2a-sft
2B • Updated • 1
TAUR-dev/M-skillfactory_yolo_1d_sft-sft
TAUR-dev/M-skillfactory_yolo_1a_1epoch_sft-sft
TAUR-dev/M-skillfactory_yolo_1e_sft-sft
2B • Updated • 2
TAUR-dev/M-skillfactory_yolo_1b_sft-sft
TAUR-dev/M-skillfactory_yolo_1f_sft-sft
TAUR-dev/M-skillfactory_yolo_1a_sft-sft
TAUR-dev/M-skillfactory_yolo_1c_sft-sft
TAUR-dev/M-9_2_25__yolo_run-sft
TAUR-dev/M-0827_rl_reflect_countdown__test-rl
0.6B • Updated
TAUR-dev/M-jack_experiments__all_stages_tacc2-rl
0.6B • Updated
TAUR-dev/M-jack_experiments__all_stages_tacc-rl
0.6B • Updated
TAUR-dev/M-0827_rl_reflect_countdown__0epoch_3and4args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated
TAUR-dev/M-0827_rl_reflect_countdown__0epoch_4args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated
TAUR-dev/M-0827_rl_reflect_countdown__2epoch_3args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated
TAUR-dev/M-0827_rl_reflect_countdown__0epoch_3args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated
TAUR-dev/M-0827_rl_reflect_countdown__4epoch_4args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated • 1
TAUR-dev/M-0827_rl_reflect_countdown__2epoch_3and4args__grpo_minibs32_lr1e-6_rolloutn16-rl
2B • Updated
TAUR-dev/M-reflection_countdown_4args_sft_4epochs-sft
2B • Updated
TAUR-dev/M-reflection_countdown_3args_sft_2epochs-sft
2B • Updated