Julius-L's Collections

multimodal dataset

updated Jan 20, 2025

  • BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

    Paper • 2412.04626 • Published Dec 5, 2024 • 13

  • GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI

    Paper • 2411.14522 • Published Nov 21, 2024 • 37

  • Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

    Paper • 2411.03823 • Published Nov 6, 2024 • 49

  • Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

    Paper • 2410.18558 • Published Oct 24, 2024 • 18

  • Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

    Paper • 2501.05767 • Published Jan 10, 2025 • 29

  • Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

    Paper • 2412.05271 • Published Dec 6, 2024 • 159