Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts
March 13, 2024, 4:47 a.m. | Thomas Palmeira Ferraz, Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina
cs.CL updates on arXiv.org arxiv.org
Abstract: Whisper is a multitask and multilingual speech model covering 99 languages. It yields commendable automatic speech recognition (ASR) results in a subset of its covered languages, but the model still underperforms on a non-negligible number of under-represented languages, a problem exacerbated in smaller model versions. In this work, we propose DistilWhisper, an approach able to bridge the performance gap in ASR for these languages while retaining the advantages of multitask and multilingual capabilities. Our approach …
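The abstract's core idea is knowledge distillation: a smaller student model is trained to match both the ground-truth transcriptions and the output distribution of a larger Whisper teacher. A minimal, generic sketch of such a distillation objective (temperature-softened teacher/student KL plus hard-label cross-entropy) is shown below; the function names, the `alpha`/`T` hyperparameters, and the use of plain NumPy are illustrative assumptions, not the paper's exact language-specific-expert setup.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Generic KD objective (illustrative, not the paper's exact loss):
    alpha * CE(student, hard labels)
      + (1 - alpha) * T^2 * KL(teacher_soft || student_soft)."""
    # Cross-entropy with the hard (ground-truth) labels
    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(len(labels)), labels]).mean()
    # KL between temperature-softened teacher and student distributions
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * (T ** 2)
    return alpha * ce + (1 - alpha) * kl
```

When the student's logits exactly match the teacher's, the KL term vanishes and only the hard-label cross-entropy remains, which is one quick sanity check for an implementation like this.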