arxiv:2501.03895
Yang Feng
fengyang0317
AI & ML interests: None yet
Organizations: None yet
Models (10)
fengyang0317/sft_output • Updated
fengyang0317/SmolLM2-FT-DPO • Text Generation • 0.1B • Updated • 6
fengyang0317/SmolLM2-FT-MyDataset • Text Generation • 0.1B • Updated • 3
fengyang0317/ppo-CartPole-v1 • Reinforcement Learning • Updated
fengyang0317/unit4 • Updated
fengyang0317/dqn-SpaceInvadersNoFrameskip-v4 • Reinforcement Learning • Updated • 4
fengyang0317/Taxi-v3 • Reinforcement Learning • Updated
fengyang0317/q-FrozenLake-v1-4x4-noSlippery • Reinforcement Learning • Updated
fengyang0317/ppo-Huggy • Reinforcement Learning • Updated • 25
fengyang0317/whisper-small-dv • Automatic Speech Recognition • 0.2B • Updated • 5
Datasets (10)
fengyang0317/commonsense • Viewer • Updated • 10.6k • 32
fengyang0317/prosqa • Viewer • Updated • 18.7k • 36
fengyang0317/prontoqa • Viewer • Updated • 10k • 72
fengyang0317/gsm8k • Viewer • Updated • 387k • 56
fengyang0317/listops-32 • Viewer • Updated • 100k • 52
fengyang0317/listops-64 • Viewer • Updated • 100k • 410
fengyang0317/listops-128 • Viewer • Updated • 100k • 17
fengyang0317/listops-d20 • Viewer • Updated • 100k • 11
fengyang0317/listops-1000 • Viewer • Updated • 100k • 50
fengyang0317/imagenet-1k • Viewer • Updated • 22 • 16