Article: Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation • Sep 16 • 17
The Smol Training Playbook 📚 • 2.72k • The secrets to building world-class LLMs
Article: MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era • Jan 15 • 48
Article: Low Latency CPU Based Educational Value Classifier With Generic Educational Value • Jun 12, 2024 • 9
FineWeb: decanting the web for the finest text data at scale 🍷 • 1.24k • Generate high-quality text data for LLMs using FineWeb
Article: From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate • Jun 13, 2024 • 61
Open LLM Progress Tracker 🔬 • 151 • Visualize Open vs. Proprietary LLM Progress