Arina Puchkova
rinapch
AI & ML interests
NLP, RL
Recent Activity
upvoted a paper about 1 month ago
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
upvoted a paper 2 months ago
PIPer: On-Device Environment Setup via Online Reinforcement Learning
upvoted a collection 7 months ago
Mellum