Swee.lol
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper
• 2505.24864
• Published
• 144
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development
Paper
• 2506.05010
• Published
• 80
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
Paper
• 2506.05301
• Published
• 59
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
Paper
• 2505.16933
• Published
• 34
Large Language Diffusion Models
Paper
• 2502.09992
• Published
• 126
MMaDA: Multimodal Large Diffusion Language Models
Paper
• 2505.15809
• Published
• 98
Paper
• 2506.10892
• Published
• 37
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Paper
• 2506.14429
• Published
• 44
Sekai: A Video Dataset towards World Exploration
Paper
• 2506.15675
• Published
• 66
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
Paper
• 2506.19290
• Published
• 53
Unified Vision-Language-Action Model
Paper
• 2506.19850
• Published
• 28
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
Paper
• 2410.16236
• Published
PresentAgent: Multimodal Agent for Presentation Video Generation
Paper
• 2507.04036
• Published
• 11
VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents
Paper
• 2507.04590
• Published
• 17
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
Paper
• 2507.05687
• Published
• 30
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper
• 2507.06261
• Published
• 67
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off
Paper
• 2508.04825
• Published
• 60
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
Paper
• 2508.09736
• Published
• 58
XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization
Paper
• 2508.10395
• Published
• 42
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
Paper
• 2508.13167
• Published
• 129
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published
• 509
Text-to-Image
• Updated
• 19
• 37
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
Paper
• 2510.08673
• Published
• 126
TiDAR: Think in Diffusion, Talk in Autoregression
Paper
• 2511.08923
• Published
• 128
Back to Basics: Let Denoising Generative Models Denoise
Paper
• 2511.13720
• Published
• 69