DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 7 days ago • 68
FrankenMotion: Part-level Human Motion Generation and Composition Paper • 2601.10909 • Published 20 days ago • 18
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published 25 days ago • 210
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published 23 days ago • 34
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 28 days ago • 52
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 28 days ago • 219
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding Paper • 2512.06581 • Published Dec 6, 2025 • 2
SOP: A Scalable Online Post-Training System for Vision-Language-Action Models Paper • 2601.03044 • Published 30 days ago • 28
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2, 2026 • 56
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 170
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 255