mimba/afrilang
This repository hosts a fine-tuned version of openai/whisper-medium adapted for Automatic Speech Recognition (ASR) of Ngiemboon (ISO 639-3: nnh) speech.
mimba/whisper-ngiemboon (base model: openai/whisper-medium)

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.7846 | 1.0 | 589 | 0.7358 | 0.6419 |
| 0.5542 | 2.0 | 1178 | 0.5998 | 0.6358 |
| 0.4704 | 3.0 | 1767 | 0.5379 | 0.5331 |
| 0.4088 | 4.0 | 2356 | 0.5138 | 0.5010 |
| 0.3807 | 5.0 | 2945 | 0.4872 | 0.5061 |
| 0.3395 | 6.0 | 3534 | 0.4809 | 0.4807 |
| 0.3426 | 7.0 | 4123 | 0.4710 | 0.4997 |
| 0.3215 | 8.0 | 4712 | 0.4676 | 0.4730 |
| 0.3045 | 9.0 | 5301 | 0.4636 | 0.4844 |
| 0.2959 | 10.0 | 5890 | 0.4636 | 0.4744 |
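
The last column of the table is the word error rate (WER). This card does not say which tool produced it, so the following is only a minimal sketch of the usual definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example strings are illustrative and do not come from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Read this way, the final-epoch value of 0.4744 would correspond to roughly 47 word-level errors per 100 reference words, assuming the table reports WER as a fraction rather than a percentage.
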
from transformers import AutoProcessor, WhisperForConditionalGeneration
import torch
import soundfile as sf
# Load the model and processor (from the Hub repo or a local folder)
processor = AutoProcessor.from_pretrained("mimba/whisper-ngiemboon")
model = WhisperForConditionalGeneration.from_pretrained("mimba/whisper-ngiemboon")
# Load the audio; Whisper's feature extractor expects 16 kHz mono input
speech, rate = sf.read("example_ngiemboon.wav")
# Prepare the input features
inputs = processor(speech, sampling_rate=rate, return_tensors="pt")
# Generate token IDs without tracking gradients
with torch.no_grad():
    predicted_ids = model.generate(inputs["input_features"])
# Decode the transcription
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
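
The snippet above passes the file's own sampling rate to the processor, which only works when the recording is already 16 kHz mono. As a rough sketch for other files, the hypothetical helper `to_mono_16k` below averages the channels and resamples by linear interpolation with NumPy. It is not part of this repository, and it applies no anti-aliasing filter, so a dedicated resampler from an audio library is likely a better choice for real recordings.

```python
import numpy as np

TARGET_RATE = 16_000  # sampling rate expected by Whisper's feature extractor


def to_mono_16k(speech, rate: int, target_rate: int = TARGET_RATE) -> np.ndarray:
    """Downmix to mono and resample to target_rate (crude linear interpolation)."""
    speech = np.asarray(speech, dtype=np.float32)
    if speech.ndim == 2:
        # soundfile returns multi-channel audio as (frames, channels)
        speech = speech.mean(axis=1)
    if rate == target_rate:
        return speech
    n_out = int(round(len(speech) * target_rate / rate))
    old_times = np.arange(len(speech)) / rate
    new_times = np.arange(n_out) / target_rate
    # Assumption: linear interpolation is acceptable for a quick test.
    return np.interp(new_times, old_times, speech).astype(np.float32)


# One second of silent stereo audio at 44.1 kHz, used only to show the shapes.
stereo = np.zeros((44_100, 2))
print(to_mono_16k(stereo, 44_100).shape)
```

With such a helper, the call would become `processor(to_mono_16k(speech, rate), sampling_rate=16000, return_tensors="pt")`.
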
@misc{afrilang,
title={afrilang: Small Out-of-domain resource for various African languages},
author={Mimba Ngouana Fofou},
year={2026},
howpublished={\url{https://huggingface.co/mimba/whisper-ngiemboon}}
}
Base model: openai/whisper-medium