Swee.lol - a sweelol Collection

sweelol 's Collections

Swee.lol

updated Nov 20, 2025

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 144
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Paper • 2506.05010 • Published Jun 5, 2025 • 80
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Paper • 2506.05301 • Published Jun 5, 2025 • 59
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

Paper • 2505.16933 • Published May 22, 2025 • 34
Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 126
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98
The Diffusion Duality

Paper • 2506.10892 • Published Jun 12, 2025 • 37
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published Jun 17, 2025 • 44
Sekai: A Video Dataset towards World Exploration

Paper • 2506.15675 • Published Jun 18, 2025 • 66
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published Jun 24, 2025 • 53
Unified Vision-Language-Action Model

Paper • 2506.19850 • Published Jun 24, 2025 • 28
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Paper • 2410.16236 • Published Oct 21, 2024
PresentAgent: Multimodal Agent for Presentation Video Generation

Paper • 2507.04036 • Published Jul 5, 2025 • 11
VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

Paper • 2507.04590 • Published Jul 7, 2025 • 17
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

Paper • 2507.05687 • Published Jul 8, 2025 • 30
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Paper • 2508.04825 • Published Aug 6, 2025 • 60
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published Aug 13, 2025 • 58
XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization

Paper • 2508.10395 • Published Aug 14, 2025 • 42
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6, 2025 • 129
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 509
bageldotcom/paris

Text-to-Image • Updated Oct 7, 2025 • 19 • 37
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 126
TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published Nov 12, 2025 • 128
Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 69