Helsinki-NLP/opus-100
How to use sanvo/vietnamese-nmt with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("translation", model="sanvo/vietnamese-nmt")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("sanvo/vietnamese-nmt", dtype="auto")

Fine-tuned facebook/nllb-200-distilled-600M (600M parameters) for trilingual translation between Vietnamese, English, and Japanese.
| Source | Target | Direction |
|---|---|---|
| Vietnamese | English | vi → en |
| English | Vietnamese | en → vi |
| Vietnamese | Japanese | vi → ja |
| Japanese | Vietnamese | ja → vi |
| English | Japanese | en → ja |
| Japanese | English | ja → en |
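The six directions in the table map onto three NLLB-200 language codes. The sketch below enumerates them; `vie_Latn`, `eng_Latn`, and `jpn_Jpan` appear in the usage example on this card, while the short keys (`vi`, `en`, `ja`) and the names `NLLB_CODES`, `DIRECTIONS`, and `direction_codes` are hypothetical helpers introduced only for illustration.

```python
from itertools import permutations

# NLLB-200 language codes for the three supported languages.
# The short keys are an illustrative convention, not part of the model.
NLLB_CODES = {"vi": "vie_Latn", "en": "eng_Latn", "ja": "jpn_Jpan"}

# Every ordered pair of distinct languages: the six rows of the table.
DIRECTIONS = list(permutations(NLLB_CODES, 2))


def direction_codes(src: str, tgt: str) -> tuple[str, str]:
    """Return (source_code, target_code) for a supported direction."""
    if src == tgt:
        raise ValueError("source and target languages must differ")
    if src not in NLLB_CODES or tgt not in NLLB_CODES:
        raise ValueError(f"unsupported direction: {src} -> {tgt}")
    return NLLB_CODES[src], NLLB_CODES[tgt]


if __name__ == "__main__":
    for src, tgt in DIRECTIONS:
        print(src, "->", tgt, direction_codes(src, tgt))
```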
Includes a character-based language detection module using Unicode range analysis.
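The detection module itself is not shown on this card. As a rough illustration of what Unicode range analysis can look like for these three languages, here is a minimal sketch. The function name `detect_language`, the chosen ranges, and the fallback to English are all assumptions, not the author's implementation.

```python
import unicodedata

# Latin letters that are distinctive for Vietnamese but sit outside the
# Latin Extended Additional block (U+1EA0..U+1EF9) checked below.
_VI_EXTRA = set("ăđơưĂĐƠƯ")


def detect_language(text: str) -> str:
    """Guess 'ja', 'vi', or 'en' from the code points in `text`."""
    # Compose base letter + combining marks so precomposed ranges apply.
    text = unicodedata.normalize("NFC", text)
    has_vietnamese = False
    for ch in text:
        cp = ord(ch)
        # Hiragana and Katakana (U+3040..U+30FF) or CJK Unified Ideographs.
        if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
            return "ja"
        # Vietnamese tone-marked letters, plus the extra letters above.
        if 0x1EA0 <= cp <= 0x1EF9 or ch in _VI_EXTRA:
            has_vietnamese = True
    # Assumption: anything without Japanese or Vietnamese marks is English.
    return "vi" if has_vietnamese else "en"


if __name__ == "__main__":
    for sample in ["Xin chào, bạn khỏe không?", "Hello, how are you?", "こんにちは"]:
        print(sample, "->", detect_language(sample))
```

A heuristic like this cannot distinguish Vietnamese typed without diacritics from English, and it treats any text containing kana or CJK ideographs as Japanese.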
# Translate with the fine-tuned model and the base NLLB tokenizer
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("sanvo/vietnamese-nmt")
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
# Vietnamese → English
tokenizer.src_lang = "vie_Latn"  # source language code
inputs = tokenizer("Xin chào, bạn khỏe không?", return_tensors="pt")
# forced_bos_token_id selects the target language; look up its token ID by language code
translated = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"))
print(tokenizer.decode(translated[0], skip_special_tokens=True))
# Vietnamese → Japanese (same inputs, different target language token)
translated = model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("jpn_Jpan"))
print(tokenizer.decode(translated[0], skip_special_tokens=True))
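The two translations above repeat the same steps with a different target code, so they can be folded into one helper. This is a sketch only: `translate` and its `max_new_tokens` default are hypothetical, and the model and tokenizer are passed in rather than loaded inside the function.

```python
def translate(text, src_code, tgt_code, model, tokenizer, max_new_tokens=128):
    """Translate `text` between two NLLB language codes, e.g. 'vie_Latn'."""
    # Tell the tokenizer which language the input is written in.
    tokenizer.src_lang = src_code
    inputs = tokenizer(text, return_tensors="pt")
    # Force the first generated token to be the target language token.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_code),
        max_new_tokens=max_new_tokens,  # assumed cap, adjust as needed
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage, assuming `model` and `tokenizer` were loaded as shown above:
# print(translate("Xin chào, bạn khỏe không?", "vie_Latn", "eng_Latn", model, tokenizer))
# print(translate("Good morning", "eng_Latn", "jpn_Jpan", model, tokenizer))
```

Passing the model and tokenizer as arguments keeps the helper free of download side effects and lets one loaded pair serve all six directions.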
Base model
facebook/nllb-200-distilled-600M