Collections
Collections including paper arxiv:2511.13720
-
Back to Basics: Let Denoising Generative Models Denoise
Paper • 2511.13720 • Published • 63 -
Virtual Width Networks
Paper • 2511.11238 • Published • 35 -
Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs
Paper • 2511.07419 • Published • 25 -
When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs
Paper • 2511.02243 • Published • 24
-
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165 -
Back to Basics: Let Denoising Generative Models Denoise
Paper • 2511.13720 • Published • 63 -
Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
Paper • 2512.04926 • Published • 28
-
Arbitrary-steps Image Super-resolution via Diffusion Inversion
Paper • 2412.09013 • Published • 13 -
Deep Researcher with Test-Time Diffusion
Paper • 2507.16075 • Published • 67 -
∇NABLA: Neighborhood Adaptive Block-Level Attention
Paper • 2507.13546 • Published • 124 -
Yume: An Interactive World Generation Model
Paper • 2507.17744 • Published • 87
-
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Paper • 2506.09790 • Published • 53 -
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance
Paper • 2506.06444 • Published • 73 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 72 -
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
Paper • 2502.04644 • Published • 4
-
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 72 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 116 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53 -
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper • 2509.12201 • Published • 104
-
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Paper • 2506.22434 • Published • 10 -
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Paper • 2507.13348 • Published • 77 -
RewardDance: Reward Scaling in Visual Generation
Paper • 2509.08826 • Published • 73 -
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
Paper • 2510.18876 • Published • 36
-
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 142 -
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development
Paper • 2506.05010 • Published • 79 -
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
Paper • 2506.05301 • Published • 56 -
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning
Paper • 2505.16933 • Published • 34