Zhenyi Shen

zen-E

https://www.zhenyishen.com/

AI & ML interests

LLM Reasoning

Organizations

None yet

Recent Activity

updated a model 4 days ago

zen-E/off-policy_student-qwen3-1b-base_teacher-qwen25-math-1b_math_e1

Updated 4 days ago

published a model 4 days ago

zen-E/off-policy_student-qwen3-1b-base_teacher-qwen25-math-1b_math_e1

Updated 4 days ago

updated a model 7 days ago

zen-E/grpo_nokl_qwen3_1b_e20_last_ckpt

Updated 7 days ago

published a model 7 days ago

zen-E/grpo_nokl_qwen3_1b_e20_last_ckpt

Updated 7 days ago

updated a model 8 days ago

zen-E/grpo_nokl_qwen3_1b_e20

Updated 8 days ago

published a model 8 days ago

zen-E/grpo_nokl_qwen3_1b_e20

Updated 8 days ago

upvoted 2 papers 24 days ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 187

NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning

Paper • 2505.16022 • Published May 21 • 4

upvoted a paper 26 days ago

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

Paper • 2511.20102 • Published Nov 25 • 26

upvoted 2 papers 29 days ago

DeepSeek-OCR: Contexts Optical Compression

Paper • 2510.18234 • Published Oct 21 • 84

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published Nov 12 • 201

upvoted a paper about 1 month ago

Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance

Paper • 2510.03528 • Published Oct 3 • 17

authored a paper about 1 month ago

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

Paper • 2511.20102 • Published Nov 25 • 26

upvoted a paper about 1 month ago

SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation

Paper • 2412.13649 • Published Dec 18, 2024 • 21

commented on a paper about 1 month ago

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

Paper • 2511.20102 • Published Nov 25 • 26

upvoted an article about 2 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 739

upvoted a paper 2 months ago

Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

Paper • 2510.11052 • Published Oct 13 • 51

liked a model 3 months ago

tencent/Youtu-Embedding

updated a model 4 months ago

zen-E/llama1be4mask0p4

1B • Updated Sep 11

published a model 4 months ago

zen-E/llama1be4mask0p4

1B • Updated Sep 11
