- DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation (arXiv:2511.06307, Nov 2025)
- Language Models Can Learn from Verbal Feedback Without Scalar Rewards (arXiv:2509.22638, Sep 26, 2025)
- Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference (arXiv:2508.02193, Aug 4, 2025)
- VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo (arXiv:2508.02317, Aug 4, 2025)
- Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology (arXiv:2507.07999, Jul 10, 2025)
- Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning (arXiv:2504.13914, Apr 10, 2025)
- Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents (arXiv:2507.04009, Jul 5, 2025)
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning (arXiv:2507.01006, Jul 1, 2025)
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning (arXiv:2506.01939, Jun 2, 2025)
- AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection (arXiv:2505.07293, May 12, 2025)
- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities (arXiv:2505.02567, May 5, 2025)