Instructions for using AIDC-AI/Ovis1.6-Gemma2-9B with libraries and local apps.

Libraries

How to use AIDC-AI/Ovis1.6-Gemma2-9B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="AIDC-AI/Ovis1.6-Gemma2-9B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
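# Run the pipeline and print the result (otherwise the output is discarded in a script)
outputs = pipe(text=messages)
print(outputs)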

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("AIDC-AI/Ovis1.6-Gemma2-9B", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use AIDC-AI/Ovis1.6-Gemma2-9B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AIDC-AI/Ovis1.6-Gemma2-9B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIDC-AI/Ovis1.6-Gemma2-9B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
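
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `send_chat_request` are illustrative and not part of vLLM, and the default base URL assumes the server started above is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "AIDC-AI/Ovis1.6-Gemma2-9B"


def build_chat_payload(prompt: str, image_url: str, model: str = MODEL_ID) -> dict:
    """Build an OpenAI-style chat payload with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def send_chat_request(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to /v1/chat/completions and return the decoded JSON reply.

    The default base_url is an assumption matching the vLLM command above.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send_chat_request(build_chat_payload(prompt, image_url))["choices"][0]["message"]["content"]` holds the model's reply. Passing a different `base_url` points the same helper at another OpenAI-compatible server.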

Use Docker

docker model run hf.co/AIDC-AI/Ovis1.6-Gemma2-9B

SGLang

How to use AIDC-AI/Ovis1.6-Gemma2-9B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AIDC-AI/Ovis1.6-Gemma2-9B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIDC-AI/Ovis1.6-Gemma2-9B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AIDC-AI/Ovis1.6-Gemma2-9B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AIDC-AI/Ovis1.6-Gemma2-9B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use AIDC-AI/Ovis1.6-Gemma2-9B with Docker Model Runner:
```
docker model run hf.co/AIDC-AI/Ovis1.6-Gemma2-9B
```

4-bit Quantization (GPTQ or GGUF)

by ThetaCursed - opened Sep 26, 2024

Discussion

ThetaCursed

Sep 26, 2024

Are there plans to release this model in a 4-bit quantized version?

runninglsy

AIDC-AI org Sep 27, 2024

We currently have no short-term plans to develop a quantized version. However, we are training smaller models (e.g., 2B to 3B parameters) to better meet different user needs and application scenarios.

ThetaCursed

Sep 27, 2024

That's very sad to hear, because a quantized version of this model should fit in 12 GB of VRAM.

Why release smaller models when you could make a 4-bit quantization of this one and let people use it locally, considering that the best-selling video card of all time is the GeForce RTX 3060 12 GB?

If you don't have anyone who can handle this, then at least leave some instructions on how to do it. I will do it and share the result with the whole community.
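
As a rough back-of-the-envelope check of the 12 GB estimate, the sketch below computes weight storage only. It assumes a nominal 9 billion parameters (read off the "9B" in the model name, not an exact count) and ignores activations, the KV cache, and quantization metadata such as scales and zero points.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the storage needed for the weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Assumption: nominal parameter count taken from the "9B" in the model name.
NOMINAL_PARAMS = 9e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights need about 18 GB while 4-bit weights need about 4.5 GB, which leaves headroom on a 12 GB card for everything the estimate leaves out.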

runninglsy

AIDC-AI org Sep 28, 2024

Thank you for the suggestion. Considering the community's feedback on the quantized version, we have decided to dedicate our efforts to developing it. We will strive to complete it within a month.

runninglsy

AIDC-AI org Nov 4, 2024

We've released quantized versions of Ovis1.6: Ovis1.6-Gemma2-9B-GPTQ-Int4 and Ovis1.6-Llama3.2-3B-GPTQ-Int4. Feel free to try them out and share your feedback!
