Zuhao Yang

mwxely

https://mwxely.github.io/

AI & ML interests

Large Multimodal Models

Recent Activity

upvoted a paper 1 day ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published 23 days ago • 128

upvoted a paper 3 days ago

EgoX: Egocentric Video Generation from a Single Exocentric Video

Paper • 2512.08269 • Published 23 days ago • 115

upvoted 2 papers 9 days ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published 13 days ago • 64

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published 10 days ago • 61

upvoted 3 papers 16 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 269

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19, 2025 • 227

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 210

upvoted 2 collections 27 days ago

Multimodal Agent

126 items • Updated about 7 hours ago • 1

AI Paper of the Day

A collection of papers that I think are interesting, one added each day • 554 items • Updated about 18 hours ago • 74

upvoted a collection about 1 month ago

LongVT-HF_Daily_Paper

1 item • Updated Dec 1, 2025 • 1

upvoted 2 papers about 1 month ago

Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models

Paper • 2512.01949 • Published about 1 month ago • 8

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 181

upvoted a collection about 1 month ago

LongVT

8 items • Updated 21 days ago • 8

upvoted a paper about 1 month ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 92

upvoted a paper 3 months ago

First Try Matters: Revisiting the Role of Reflection in Reasoning Models

Paper • 2510.08308 • Published Oct 9, 2025 • 24

upvoted a paper 5 months ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 134

upvoted a paper 8 months ago

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

Paper • 2505.00551 • Published May 1, 2025 • 36