# Kumaoni MBART Translation Model
This model translates between English and Kumaoni, a regional Indo-Aryan language spoken in Uttarakhand, India.
It was fine-tuned from facebook/mbart-large-50-many-to-many-mmt using a custom English↔Kumaoni dataset.
## Model Overview
| Field | Description |
|---|---|
| Base model | facebook/mbart-large-50-many-to-many-mmt |
| Fine-tuning method | LoRA adapters via PEFT |
| Languages | English (en) and Kumaoni (kfy) |
| Framework | PyTorch + Transformers |
| Trained on | Apple MacBook Air M3, 16GB RAM, 10-core GPU |
| Developer | Ravi Mishra |
| License | Non-commercial / Research only |
| Dataset size | ~1,000 sentence pairs |
| Training epochs | 3 |
| Learning rate | 2e-4 |
| Batch size | 8 |
| Precision | fp32 |
| Optimizer | AdamW |
| Scheduler | Linear warmup-decay |
| Loss | CrossEntropyLoss |
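
As a reference for the fine-tuning method listed above, a minimal sketch of attaching LoRA adapters with PEFT is shown below; the rank, alpha, dropout, and target modules are illustrative assumptions, not the exact values used for this model.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Load the mBART-50 base model.
base = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

# Wrap it with LoRA adapters on the attention projections.
# r, lora_alpha, lora_dropout, and target_modules are assumptions.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are updated during training
```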
## Training Details
### Environment
- Hardware: Apple MacBook Air M3 (16GB RAM, 10-core GPU)
- Backend: MPS (Metal Performance Shaders)
- OS: macOS 15 Sequoia
- Python version: 3.13
- Transformers: 4.45+
- PEFT: 0.13+
- Torch: 2.4+
- Dataset format: CSV → Hugging Face Dataset
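
For context, the hyperparameters in the overview table map onto a standard Seq2SeqTrainer setup roughly as sketched below; the warmup ratio, output path, and dataset/tokenizer variables are placeholders rather than confirmed values.

```python
import torch
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer, DataCollatorForSeq2Seq

# Recent PyTorch builds use the MPS backend on Apple Silicon automatically;
# this check only confirms it is available.
assert torch.backends.mps.is_available()

args = Seq2SeqTrainingArguments(
    output_dir="kumaoni-mbart-lora",
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    lr_scheduler_type="linear",   # linear warmup-decay; AdamW is the default optimizer
    warmup_ratio=0.1,             # assumption: the warmup fraction is not stated in this card
    logging_steps=10,
)

trainer = Seq2SeqTrainer(
    model=model,                    # the PEFT-wrapped mBART model
    args=args,
    train_dataset=tokenized_train,  # placeholder: the tokenized English-Kumaoni dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```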
### Example of Training Log
```text
{'loss': 11.3888, 'grad_norm': 1.16, 'epoch': 0.01}
{'loss': 10.4045, 'grad_norm': 0.45, 'epoch': 0.03}
{'loss': 10.1496, 'grad_norm': 0.31, 'epoch': 0.06}
{'loss': 9.8452, 'grad_norm': 0.28, 'epoch': 0.20}
{'loss': 8.9321, 'grad_norm': 0.23, 'epoch': 0.50}
{'loss': 7.6408, 'grad_norm': 0.19, 'epoch': 1.00}
```
Final model checkpoint saved at: kumaoni-mbart-lora/
Average final training loss: ~7.6
Manual evaluation: roughly 85% of conversational sentences were judged acceptable (see the Evaluation Metrics section below for the approximate BLEU score).
## Dataset
A small custom parallel dataset of English↔Kumaoni phrases, hand-curated for natural conversations.
| English | Kumaoni |
|---|---|
| how is the farming now? | kheti paati kas chal rai. |
| what are you looking for here and there? | yath-wath ki dhunan laag raye chha? |
| rivers are about to get filled in the rainy season. | chaumaas ma gaad gadhyaar bharan haini. |
| there is always a snake in the field. | khet ma hamesha saap ro. |
The dataset is stored locally at datasets/english_kumaoni/.
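
Loading that local CSV as a Hugging Face Dataset could look like the sketch below; the file name and column names ("english", "kumaoni") are assumptions about the local layout.

```python
from datasets import load_dataset

# Assumed layout: a single CSV with "english" and "kumaoni" columns.
dataset = load_dataset(
    "csv",
    data_files="datasets/english_kumaoni/train.csv",
    split="train",
)
print(dataset[0])
# e.g. {'english': 'how is the farming now?', 'kumaoni': 'kheti paati kas chal rai.'}
```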
## Inference Example
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer.
model_name = "dlucidone/kumaoni-mbart-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Translate a short English sentence into Kumaoni.
text = "how is the farming now?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Output:
```text
kheti paati kas chal rai.
```
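
If the repository ships only the LoRA adapter weights rather than a merged checkpoint, the adapter can instead be loaded on top of the base model with PEFT; a minimal sketch under that assumption:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the frozen mBART-50 base model and attach the fine-tuned LoRA adapter.
base = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
model = PeftModel.from_pretrained(base, "dlucidone/kumaoni-mbart-lora")
tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

inputs = tokenizer("how is the farming now?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```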
## Intended Uses
### Direct Use
- Translate short sentences between English and Kumaoni.
- Integrate into chatbots or cultural/language-learning apps.
### Downstream Use
- RAG systems for Kumaoni knowledge bases.
- Low-resource translation research.
### Out-of-Scope
- Commercial products or training larger models without written permission.
- Use for misinformation or cultural misrepresentation.
## Limitations
- Limited vocabulary coverage.
- Idioms tend to be translated literally.
- Not robust for poetic or complex sentence structures.
## Evaluation Metrics
| Metric | Result | Comment |
|---|---|---|
| BLEU (approx.) | 32 | Small dataset, fair alignment |
| Accuracy (manual) | ~85% | Conversational phrases |
| Inference time | ~0.2 s / sentence | Apple M3 GPU (MPS) |
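
The approximate BLEU figure can be reproduced on a held-out split with sacrebleu; a minimal sketch (the example sentences below are placeholders):

```python
import sacrebleu

# Model outputs and gold Kumaoni references, aligned by index (placeholders shown).
hypotheses = ["kheti paati kas chal rai."]
references = [["kheti paati kas chal rai."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")
```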
## Technical Specs
- Architecture: Seq2Seq (mBART-50)
- Parameters: ~610M (with LoRA)
- Tokenizer: SentencePiece (built-in)
- Max sequence length: 128 tokens
- Frameworks: PyTorch + Hugging Face Transformers + PEFT
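
Consistent with the 128-token limit above, preprocessing for training might look like the sketch below; the column names and the use of text_target for the Kumaoni side are assumptions.

```python
def preprocess(batch):
    # Tokenize English inputs and Kumaoni targets, truncating both to 128 tokens.
    model_inputs = tokenizer(batch["english"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["kumaoni"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Map over the CSV-backed dataset loaded earlier.
tokenized_train = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)
```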
## Environmental Impact
- Hardware: Apple M3 10-core GPU
- Training time: ~35 minutes
- Energy: Low (<0.3 kWh estimated)
- Carbon footprint: Negligible (local training)
## Citation
APA:
Mishra, R. (2025). Kumaoni MBART Translation Model (v1.0). Fine-tuned from facebook/mbart-large-50-many-to-many-mmt using LoRA adapters.
BibTeX:
```bibtex
@misc{mishra2025kumaonimbart,
  author       = {Ravi Mishra},
  title        = {Kumaoni MBART Translation Model},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/dlucidone/kumaoni-mbart-lora}},
  note         = {Fine-tuned from facebook/mbart-large-50-many-to-many-mmt}
}
```
## Copyright & License
© 2025 Ravi Mishra.
All rights reserved.
Usage Policy:
This model is released for research, educational, and cultural preservation purposes only.
Any commercial use, redistribution, or retraining on this model's outputs is strictly prohibited without prior written permission from the author.
## Contact
Author: Ravi Mishra
Email: [[email protected]]
Hugging Face: https://huggingface.co/dlucidone