-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 95 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 36 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
Collections
Collections including paper arxiv:2601.01425
-
Textbooks Are All You Need
Paper • 2306.11644 • Published • 154 -
Self-Improving VLM Judges Without Human Annotations
Paper • 2512.05145 • Published • 20 -
FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing
Paper • 2601.01720 • Published • 6 -
MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique
Paper • 2511.09067 • Published • 2
-
Arrexel/pattern-diffusion
Text-to-Image • Updated • 602 • 110 -
flymy-ai/qwen-image-realism-lora
Text-to-Image • Updated • 1.55k • 129 -
QuantStack/Wan2.2-Fun-A14B-Control-GGUF
Text-to-Video • 14B • Updated • 3.51k • 33 -
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Paper • 2601.01425 • Published • 52
-
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Paper • 2401.09985 • Published • 18 -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper • 2401.09962 • Published • 9 -
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
Paper • 2401.10404 • Published • 10 -
ActAnywhere: Subject-Aware Video Background Generation
Paper • 2401.10822 • Published • 13
-
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Paper • 2601.01425 • Published • 52 -
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
Paper • 2601.20218 • Published • 15 -
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space
Paper • 2602.02092 • Published • 18 -
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
Paper • 2602.03796 • Published • 57
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 10 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 17 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 17 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 25
-
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
Paper • 2404.15275 • Published -
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
Paper • 2403.13535 • Published • 23 -
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Paper • 2309.05793 • Published • 51 -
GHOST 2.0: generative high-fidelity one shot transfer of heads
Paper • 2502.18417 • Published • 67