LLaDA2.0: Scaling Up Diffusion Language Models to 100B • Paper • 2512.15745 • Published 19 days ago • 77
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story • Paper • 2511.15210 • Published Nov 19 • 89
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance • Paper • 2511.13254 • Published Nov 17 • 136
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success • Paper • 2508.04280 • Published Aug 6 • 35
Vadim21221/sae__mount_path_qwen2.5-7b_resid_post_layer_14_size_163840_mul_fractal_topk Updated Jul 29 • 2
Vadim21221/sae_Qwen_Qwen2.5-1.5B_resid_post_layer_14_size_131072_mul_fractal_jumprelu Updated Jul 29 • 4
Vadim21221/sae_Qwen_Qwen2.5-1.5B_resid_post_layer_14_size_65536_mul_fractal_jumprelu Updated Jul 29 • 4