WhisperKit
WhisperKit is an on-device speech recognition framework for Apple Silicon: https://github.com/argmaxinc/WhisperKit
Check out the WhisperKit paper and presentation from ICML 2025: https://icml.cc/virtual/2025/47854
For a real-time streaming API, custom vocabulary, speaker diarization, and more, check out the Argmax SDK: https://www.argmaxinc.com/blog/argmax-sdk-2
Evaluation: openai_whisper-large-v3-v20240930_turbo_632MB
Transcription test results for the turbo 632MB model from this repo (aoiandroid/whisperkit-coreml).
Environment
| Item | Value |
|---|---|
| Platform | macOS 14.x (arm64, Apple Silicon) |
| WhisperKit | argmaxinc/WhisperKit 0.15.0+ (Swift Package) |
| Model repo | aoiandroid/whisperkit-coreml |
| Test date | 2026-03-17 |
| Audio formats | m4a, mp3, wav, flac |
Test results (14 files, multi-language)
| File | Language / Content | Note |
|---|---|---|
| English.mp3 | English | Texas travel narration (Gage Hotel, Padre Island, Corpus Christi, seafood); stable long-form transcription |
| Euskara.mp3 | Basque | Speech on language and identity |
| Guaraní.mp3 | Guarani | Short speech |
| Yorùbá.mp3 | Yoruba | Education and future |
| afrikaasns.mp3 | Afrikaans | Value of learning a new language |
| arabic.mp3 | Arabic | Speech on hope and future (full Arabic) |
| bengali.m4a | Bengali | Some mixed-language / recognition errors |
| chinese.mp3 | Chinese | Long explanation on smart traffic systems |
| isiZulu.mp3 | isiZulu | Future, education, youth |
| kiswahili.mp3 | Kiswahili | Unity (umoja) |
| korean.mp3 | Korean | "On challenge" (도전에 대하여) |
| russinan.m4a | Russian | Russia–Latin America parliamentary conference (with some English at end) |
| test.mp3 | Japanese | Typhoon 14 news; high accuracy |
| 日本語.mp3 | Japanese | Ostrich facts / comedy; high accuracy |
Actual transcription results (STT output)
Below is the actual speech-to-text output from the model for a selection of the test files. Long transcriptions are truncated; the full text is in eval_logs/ in this repo.
English.mp3
I'm Tara Kirschner. I shoot photos all over the world, and I've been blown away by what I've seen so far on my first trip to Texas. I've paddled canyons, hiked through history, and photographed stars in one of the darkest skies in the country. After two incredible days, I decide to head down to Corpus Christi. But before I do, I want to make a quick stop in the charming little town of Marathon and check out its iconic attraction, the Gage Hotel. The Gage Hotel is very West Texas. Built in 1927 by local ranching tycoon Alfred Gage, who owned half a million acres of prime grazing land. You can really tell that it's been there since the 1920s. There's a lot of history on the walls. I feel like I'm stepping into a movie. Oh wow, this is cool. Marathon and the surrounding area are rich in art and culture. And I found the same spirit at the Gage Hotel. like this art installation I found just outside, where even the smallest details tell a bigger story. It's the kind of place I'd love to shoot. I could have just stayed there all day taking pictures. I'm ready to eat, so I head over to the V6 coffee shop and order the migas. Thank you. You're welcome. They're the migas. Enjoy. A local specialty with eggs, cheese, pico, and tortilla, all mixed together. Tex-Mex food is my jam. Literally, put anything in a tortilla and I'm happy. How was everything? It was amazing, thank you so much. Yeah, migas are very popular here. I bet. Are you from here? Yes. Oh, wow. Born and raised, yes we are. Wow. One of the things I noticed is that the art scene here is unreal. Has it always been like that? I feel like it recently just kind of took over, kind of here, Alpine, Marfa area. I'm obsessed. I walk like three feet and take a photo, and three more feet and take a photo. Awesome. Well, thank you for coming. Thank you. And safe travels to Corpus. Thank you. My next stop, Corpus Christi. My adventure here starts at sunrise. Here on Padre Island, I'm meeting January from Horse House. 
And we're about to go ride horses on the beach.
[... full text in eval_logs/ ...]
test.mp3 (Japanese)
過去最強クラスの台風14号が近づいてきています。九州南部、そして北部の皆さんを中心に防風、高波、高潮、大雨などに最大級の警戒をしてください。
日本語.mp3 (Japanese)
ใใใงใฆใฎ้ ญใๆชใใใใใฎใขใใฟใใใซ็่ตฐใใฆใใ็ใ็ฉใฏใใใงใฆใจใใไฝ้ท2.5mไฝ้150kgใ่ถ
ใใ่ฆๆ ผๅคใฎใตใคใบใงใใใใฆใใฎๅทจไฝใงๆ้70kgใง่ตฐใใใจใใงใใพใใใฎในใใผใใง่ตฐใใ็ใ็ฉใงใใๆฐๅฐใชใใฎใงใใใใใงใฆใฏ้ฉใใใจใซๆ้60kgไปฅไธไฟใฃใใพใพ1ๆ้่ตฐใใใจใใงใใพใใใๆฐใฅใใฆใใๆนใใใใใใใใพใใใใใใงใฆใฏใใซใใฉใฝใณใชใ42ๅใงๅฎ่ตฐใใพใใพใๅฅณๆงใๆงใใใปใฉใฎใใตใใตใพใคใใจใใใใชใใ็ณใฏ5ใญใญๅ
ใฎใใฎใ่ช่ญใงใไธ็ไธ่ฆๅใฎ่ฏใๅ็ฉใจใ่จใใใฆใใพใใใใพใงใงใใใงใฆใฎๅชใใ่บซไฝ่ฝๅใๅใใใจๆใใพใใๆใๅชใใฆใใใฎใฏๅๅพฉๅใจๅ
็ซๅใฎ้ซใใงใใใใงใฆใฏใใใฏใฝๅถๆดใชใฎใงใใๆชๆใใใใฎๅทๅฃใใซใฉในใชใฉใซใคใคใใใฆ้ชจใไธธ่ฆใใซใชใใใจใใใใพใใใใใใใงใฆใฏ็ใใใใจใๆฐใซใใใใจใใชใ1ใถๆใใใใฐๅ
จใฆๅ็ใใพใใพใใฉใใปใฉๅทใใงใใใใจใๅ
็ซๅใ็ฐๅธธใซ้ซใใใๆๆ็ใซใใใใใจใใชใ็
ๆฐใงๆญปใฌใใจใฏใใใพใใใใใปใฉใซใพใงๅชใใ่บซไฝ่ฝๅใๆใฃใฆใใใใใงใฆใงใใไธใคใ ใ่ดๅฝ็ใชๆฌ ็นใใใใพใใใใฏๅฅ่ทก็ใช้ ญใฎๆชใใงใใใใงใฆใฏๅทจไฝใจใฏ่ฃ่
นใซ่ณใฟใใฏใใใฟใตใคใบใฎ40gใปใฉใจ้ๅธธใซๅฐใใใทใฏใใใใพใใใใฎใใ่จๆถๅใๅฃๆป
็ใงใๅบๆฌ็ใซไฝใ่ฆใใใใพใใๅฎถๆใฎ้กใ่ฆใใใใชใใฎใงๅฎถๆใๅ
ฅใๆฟใใฃใใๆธใฃใใใใฆใๆฐใฅใใพใใใใใซไบบใ่ไธญใซ้ฃใณไนใฃใฆใใใฎใใจใไธ็ฌใงๅฟใใใใไบบใ่ไธญใซไนใใใพใพๆฎ้ใซ็ๆดปใใ ใใพใใพใ่ใใใใจใใงใใชใใใไธ็พฝใ่ตฐใๅบใใใใใฃใใฎใใใซๅ
จๅกใ่ตฐใๅบใใพใใชใ่ตฐใใฎใใฏๆฌไบบใใกใๅใใฃใฆใใพใใใใใใขใใฟใใใช่บซไฝ่ฝๅใจๅฅ่ทก็ใช้ ญใฎๆชใใๅ
ผใญๅใใ็ใ็ฉใใใงใฆใชใฎใงใ
korean.mp3 (Korean)
도전에 대하여 여러분 안녕하십니까? 오늘 저는 도전이라는 주제에 대해 이야기하고자 합니다. 우리는 살아가면서 크고 작은 선택의 순간을 맞이합니다. 그 선택 앞에서 우리는 종종 두려움을 느낍니다. 실패하면 어떻게 할까? 사람들이 나를 어떻게 볼까? 나는 과연 잘할 수 있을까? 하는 걱정이 우리를 망설이게 만듭니다. 하지만 도전하지 않으면 아무것도 변하지 않습니다. 도전은 우리의 가능성을 발견하게 해주는 첫걸음입니다. 비록 결과가 우리가 원하는 방향이 아니더라도 그 과정 속에서 우리는 배우고 성장합니다. 실패는 끝이 아니라 배움의 시작입니다. 성공한 사람들의 이야기를 들어보면 그들 역시 수많은 실패를 경험했습니다. 중요한 것은 넘어지지 않는 것이 아니라 넘어졌을 때 다시 일어나는 용기입니다. 도전은 완벽한 준비가 되었을 때 시작하는 것이 아니라 부족함을 인정하면서도 한 걸음을 내딛는 순간 시작됩니다. 또한 도전은 우리에게 자신감을 줍니다. 작은 도전을 하나씩 이루어 갈 때마다 우리는 나도 할 수 있다는 믿음을 얻게 됩니다. 그 믿음은 또 다른 도전으로 이어지고 결국 우리의 삶을 더 넓고 깊게 만들어줍니다. 물론 도전에는 두려움이 따릅니다. 그러나 두려움이 있다는 것은 그 일이 우리에게 중요하다는 뜻이기도 합니다. 두려움을 피하기보다 마주할 때 우리는 이전보다 더 강해집니다. 그리고 그 경험은 우리의 인생에서 소중한 자산이 됩니다. 여러분, 지금 마음속에 망설이고 있는 일이 있다면 한번 용기를 내보십시오. 거창한 목표가 아니어도 괜찮습니다. 새로운 취미를 시작하는 것, 새로운 사람에게 먼저 인사하는 것, 새로운 공부를 시작하는 것, all of these are also challenges. 중요한 것은 크기가 아니라 시도하는 마음입니다. 우리의 인생은 한 번뿐입니다. 안전한 길만을 선택하기에는 너무나 소중한 시간입니다. 때로는 실패하더라도 도전했던 기억은 후회보다 더 값진 경험으로 남습니다. 마지막으로 여러분께 말씀드리고 싶습니다. 도전은 특별한 사람만의 권리가 아닙니다. 바로 지금 이 자리에 있는 우리 모두의 권리이며 가능성입니다. 오늘 작은 한 걸음을 내딛는다면 내일은 분명히 달라질 것입니다. 경청해 주셔서 감사합니다.
arabic.mp3 (Arabic)
خطاب عن الأمل والمستقبل السلام عليكم ورحمة الله وبركاته أيها الحضور الكريم يسعدني أن أقف أمامكم اليوم لأتحدث عن موضوع مهم في حياتنا جميعاً وهو الأمل والمستقبل. إن الأمل هو النور الذي يضيء طريقنا في أوقات الظلام وهو القوة التي تدفعنا إلى الاستمرار رغم الصعوبات والتحديات. نحن نعيش في عالم مليء بالتغيرات السريعة كل يوم نواجه أخبار جديدة وتحديات مختلفة وربما أحيانا نشعر بالخوف أو القلق من المستقبل ولكن مهما كانت الظروف يبقى الأمل هو السلاح الأقوى الذي نملكه فبدون الأمل نفقد الرغبة في العمل ونفقد الإيمان بقدرتنا على التغيير إن المستقبل لا يبنى بالأحلام وحدها بل يبنى بالعمل والاجتهاد والإصرار عندما نؤمن بأنفسنا ونسعى لتطوير مهاراتنا ونتعلم من أخطائنا فإننا نضع أساسا قويا لمستقبل أفضل كل إنجاز عظيم بدأ بفكرة صغيرة وكل نجاح كبير كان نتيجة خطوات متواضعة ولكن ثابتة الأمل لا يعني تجاهل الواقع أو إنكار الصعوبات بل يعني النظر إلى التحديات كفرص للنمو والتعلم فعندما نواجه الفشل نتعلم الصبر وعندما نواجه العقبات نكتسب القوة وهكذا نصبح أكثر استعداداً لصناعة مستقبل مشرق كما أن للأمل دوراً مهماً في بناء المجتمعات عندما ينتشر التفاؤل بين الناس يزداد التعاون والتضامن وعندما نعمل معا بروح إيجابية نستطيع أن نحقق إنجازات عظيمة تفيد الجميع فالمستقبل ليس مسؤولية فرد واحد بل هو مسؤولية مشتركة بيننا جميعا أيها الحضور الكريم لنحلم نعم ولكن لنعمل أيضا لنثق بقدراتنا ولنساعد بعضنا البعض لنجعل من الأمل أسلوب حياة لا مجرد كلمة نرددها فكل يوم جديد هو فرصة جديدة وكل لحظة هي بداية محتملة لنجاح قادم وفي الختام تذكروا أن المستقبل يصنع اليوم وأن الأمل هو البذرة التي إذا زرعناها بالعزيمة والعمل أثمرت نجاحا وسعادة شكرا لحسن استماعكم والسلام عليكم ورحمة الله وبركاته
Full plain-text transcriptions for all 14 files (including Euskara, Guarani, Yoruba, Afrikaans, Bengali, Chinese, isiZulu, Kiswahili, and Russian) are in eval_logs/whisperkit_aoiandroid_test_2026-03-17T13-41-58.186Z.log in this repo.
Quality notes
- English: Stable long-form narration.
- Japanese: High accuracy on news and narrative (test.mp3, 日本語.mp3).
- Korean, Chinese, Arabic, Russian: Consistent recognition on long content.
- Multilingual: The model tagged many segments as [en] even though the text itself was transcribed correctly in the source language.
- Bengali: Some mixed-script output and recognition errors.
Reproduce
```bash
cd TranslateBluePackage
WHISPERKIT_TEST_AUDIO_DIR=/path/to/input/audio \
WHISPERKIT_TEST_LOG_DIR=/path/to/Log \
swift test --filter WhisperKitAOIAndroidModelTests
```
(Use WhisperKitConfig(model: "openai_whisper-large-v3-v20240930_turbo_632MB", modelRepo: "aoiandroid/whisperkit-coreml") in your Swift code.)
Full transcription log: see the file under eval_logs/ in this repo (e.g. whisperkit_aoiandroid_test_2026-03-17T13-41-58.186Z.log).
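Before running the test, it can help to confirm that the input directory actually holds the expected number of audio files (14 in this evaluation). The following is a minimal sketch, not part of this repo: the function name `count_audio_files` is hypothetical, and it assumes a flat directory containing only the formats listed under Environment (m4a, mp3, wav, flac).

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not shipped with this repo).
# Prints the number of audio files directly inside the given directory
# whose extension is one of: m4a, mp3, wav, flac.
count_audio_files() {
  local dir="$1"
  # -maxdepth 1 assumes the audio files are not nested in subdirectories.
  # -iname matches extensions case-insensitively (e.g. .MP3 and .mp3).
  find "$dir" -maxdepth 1 -type f \
    \( -iname '*.m4a' -o -iname '*.mp3' -o -iname '*.wav' -o -iname '*.flac' \) \
    | wc -l | tr -d ' '
}

# Usage (the directory is whatever you pass as WHISPERKIT_TEST_AUDIO_DIR):
#   count_audio_files /path/to/input/audio
```

If the printed count differs from the number of files you intend to evaluate, check for unsupported extensions or files placed in subdirectories before starting the run.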