Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Mar 12, 2025 • 481
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 18 days ago • 103
LLaVa-NeXT-Video Collection LLaVa-NeXT-Video extends LLaVa-NeXT for video understanding. • 5 items • Updated Jun 10, 2024 • 9
LLaVA-1.6 Collection A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31, 2024 • 75
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21, 2025 • 64
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 46
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 649
Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated Sep 18, 2024 • 60
LLaVa-Interleave Collection LLaVa models that extend the model capabilities to multi-image, multi-frame (video), and multi-patch (single-image) scenarios. • 3 items • Updated Jul 10, 2024 • 15
BLIP2 models Collection A collection of all BLIP2 models! • 5 items • Updated Oct 31, 2025 • 21