Instructions for using arcee-ai/Arcee-Blitz with libraries and local apps.

Libraries

How to use arcee-ai/Arcee-Blitz with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="arcee-ai/Arcee-Blitz")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Arcee-Blitz")
model = AutoModelForCausalLM.from_pretrained("arcee-ai/Arcee-Blitz")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use arcee-ai/Arcee-Blitz with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "arcee-ai/Arcee-Blitz"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Arcee-Blitz",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
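
The same chat request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the response layout is assumed to follow the OpenAI chat completions schema that the server advertises compatibility with.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="arcee-ai/Arcee-Blitz"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the reply text; needs a running server."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumption: the response follows the OpenAI chat completions layout.
    return body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of France?")
print(request.full_url)
```

Calling `ask("What is the capital of France?")` sends the request once a server is listening. For a server on another port, such as 30000, pass `base_url="http://localhost:30000"`.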

Use Docker

docker model run hf.co/arcee-ai/Arcee-Blitz

SGLang

How to use arcee-ai/Arcee-Blitz with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "arcee-ai/Arcee-Blitz" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Arcee-Blitz",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "arcee-ai/Arcee-Blitz" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Arcee-Blitz",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use arcee-ai/Arcee-Blitz with Docker Model Runner:
```
docker model run hf.co/arcee-ai/Arcee-Blitz
```

merge_method: arcee_fusion?

by Undi95 - opened Feb 21, 2025

Discussion

Undi95

Feb 21, 2025

Hello!

I just saw that you used a new merging method to create this model. What is arcee_fusion?
I checked the mergekit GitHub page, but I don't see anything about it. I'm curious!

Thank you!

MaziyarPanahi

Arcee AI org Feb 21, 2025

edited Feb 21, 2025

If I remember correctly, it merges model weights by computing dynamic thresholds to identify the important elements, then selectively merges those elements to create a fused model. I might be wrong, though, so you should check the source code: https://github.com/arcee-ai/mergekit/blob/main/mergekit/merge_methods/arcee_fusion.py
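
To make the idea above concrete, here is a toy sketch of threshold-based selective merging on flat lists of weights. It is not the mergekit implementation: the function name `fuse_weights`, the importance score (absolute difference), and the threshold rule (median plus a multiple of the interquartile range) are all assumptions chosen for illustration, and the linked source file is the authority on what arcee_fusion actually does.

```python
import statistics


def fuse_weights(base, other, k=1.5):
    """Hypothetical sketch: take values from `other` only where they matter most."""
    # Importance score (assumption): absolute element-wise difference.
    importance = [abs(o - b) for b, o in zip(base, other)]
    # Dynamic threshold (assumption): median + k * interquartile range,
    # derived from the scores themselves rather than fixed in advance.
    q1, median, q3 = statistics.quantiles(importance, n=4)
    threshold = median + k * (q3 - q1)
    # Selective merge: keep the base value unless the score exceeds the threshold.
    return [o if s > threshold else b
            for b, o, s in zip(base, other, importance)]


base = [0.0] * 8
other = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 5.0]
fused = fuse_weights(base, other)
print(fused)
```

In this toy input only the last element differs sharply from the base, so it is the only one copied into the fused result; the small differences fall under the threshold and the base values are kept.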

Crystalcareai

Arcee AI org Feb 22, 2025

We're currently writing a paper on it, but the link above from @MaziyarPanahi is accurate as to the implementation used.

Undi95

Feb 22, 2025

Thank you very much to you two!
