S.F.

search-facility · ipv6

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Mode Seeking meets Mean Seeking for Fast Long Video Generation

upvoted a paper 5 days ago

Image Generation with a Sphere Encoder

upvoted a paper 5 days ago

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Organizations

None yet

upvoted a paper 1 day ago

Mode Seeking meets Mean Seeking for Fast Long Video Generation

Paper • 2602.24289 • Published 5 days ago • 32

upvoted 5 papers 5 days ago

Image Generation with a Sphere Encoder

Paper • 2602.15030 • Published 16 days ago • 15

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Paper • 2602.19163 • Published 10 days ago • 14

Solaris: Building a Multiplayer Video World Model in Minecraft

Paper • 2602.22208 • Published 7 days ago • 27

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Paper • 2602.12160 • Published 20 days ago • 37

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

Paper • 2602.22190 • Published 7 days ago • 15

upvoted 4 papers 13 days ago

COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression

Paper • 2602.15200 • Published 16 days ago • 7

Revisiting the Platonic Representation Hypothesis: An Aristotelian View

Paper • 2602.14486 • Published 16 days ago • 11

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published 15 days ago • 104

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published 17 days ago • 55

liked a model 15 days ago

shallowdream204/BitDance-14B-16x

Text-to-Image • 15B • Updated 14 days ago • 265 • 87

upvoted 3 papers 15 days ago

Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Paper • 2602.11858 • Published 20 days ago • 58

SemanticMoments: Training-Free Motion Similarity via Third Moment Features

Paper • 2602.09146 • Published 23 days ago • 21

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published 21 days ago • 237

upvoted 2 papers 19 days ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published 23 days ago • 28

When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published 21 days ago • 29

upvoted a paper 20 days ago

Prism: Spectral-Aware Block-Sparse Attention

Paper • 2602.08426 • Published 23 days ago • 36

upvoted a paper 22 days ago

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published 26 days ago • 71

upvoted 2 papers 26 days ago

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published Oct 23, 2025 • 50

ERNIE 5.0 Technical Report

Paper • 2602.04705 • Published 28 days ago • 261