PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published 24 days ago • 75
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20 • 120 • 5
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs Paper • 2510.11062 • Published Oct 13 • 28
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing Paper • 2508.09192 • Published Aug 8 • 30
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14 • 55
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile Paper • 2502.06155 • Published Feb 10 • 10
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers Paper • 2310.03294 • Published Oct 5, 2023 • 2
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 38
Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks Paper • 2306.13103 • Published Jun 16, 2023 • 2
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving Paper • 2401.09670 • Published Jan 18, 2024 • 2