This is an FP8 (dynamic) quantized variant of google/translategemma-4b-it, created by The Kaitchup (newsletter: https://kaitchup.substack.com).

More details (training recipe, benchmarks, and recommended settings) will be added later. In the meantime, here are the current notes and a working inference example.

Status / limitations

  • Quick smoke test only (not fully evaluated).
  • RoPE parameters were removed for compatibility with vLLM. As a result, long-context behavior may be degraded. I have not verified the impact yet.
  • The chat template is not supported (for now). To use the model with vLLM, call the completions endpoint (/v1/completions) and provide a fully formatted Gemma prompt, as shown below.

Serving with vLLM

vllm serve kaitchup/translategemma-4b-it-FP8-Dynamic \
  --max-model-len 2048 \
  --chat-template-content-format openai \
  --served-model-name gemma

curl -s http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
    "model": "gemma",
    "prompt": "<bos><start_of_turn>user\nYou are a professional French (fr) to English (en) translator. Your goal is to accurately convey the meaning and nuances of the original French text while adhering to English grammar, vocabulary, and cultural sensitivities.\nProduce only the English translation, without any additional explanations or commentary. Please translate the following French text into English:\n\n\nJ'\''aime les pâtes !<end_of_turn>\n<start_of_turn>model\n",
    "temperature": 0,
    "max_tokens": 200,
    "stop": ["<end_of_turn>"]
  }'
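The same request can be made from Python. Below is a minimal sketch using only the standard library: it rebuilds the raw Gemma-style prompt from the curl example above (the `build_prompt` helper and its parameters are illustrative, not part of the model card) and posts it to the completions endpoint, assuming the vLLM server above is running on localhost:8000.

```python
import json
import urllib.request

def build_prompt(src_name: str, src_code: str, tgt_name: str, tgt_code: str, text: str) -> str:
    """Format a raw Gemma-style translation prompt (mirrors the curl example)."""
    instruction = (
        f"You are a professional {src_name} ({src_code}) to {tgt_name} ({tgt_code}) translator. "
        f"Your goal is to accurately convey the meaning and nuances of the original {src_name} text "
        f"while adhering to {tgt_name} grammar, vocabulary, and cultural sensitivities.\n"
        f"Produce only the {tgt_name} translation, without any additional explanations or commentary. "
        f"Please translate the following {src_name} text into {tgt_name}:\n\n\n{text}"
    )
    return f"<bos><start_of_turn>user\n{instruction}<end_of_turn>\n<start_of_turn>model\n"

def translate(text: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST a completions request to the vLLM server and return the translation."""
    payload = {
        "model": "gemma",
        "prompt": build_prompt("French", "fr", "English", "en", text),
        "temperature": 0,
        "max_tokens": 200,
        "stop": ["<end_of_turn>"],
    }
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"].strip()

# translate("J'aime les pâtes !")  # requires the vLLM server above to be running
```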
Model details

  • Format: Safetensors
  • Size: 4B params
  • Tensor types: BF16, F8_E4M3