LLM 🦜
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper • 2412.13663 • Published • 161
Paper • 2412.15115 • Published • 377
Are Your LLMs Capable of Stable Reasoning?
Paper • 2412.13147 • Published • 93
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 108
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper • 2412.10360 • Published • 147
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
Paper • 2412.05271 • Published • 160
Enhancing Human-Like Responses in Large Language Models
Paper • 2501.05032 • Published • 61
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 152