March 5, 2024, 2:45 p.m. | Antoine Nzeyimana

cs.LG updates on arXiv.org arxiv.org

arXiv:2308.11863v3 Announce Type: replace-cross
Abstract: Despite the recent availability of large transcribed Kinyarwanda speech datasets, achieving robust speech recognition for Kinyarwanda remains challenging. In this work, we show that self-supervised pre-training, a simple curriculum schedule during fine-tuning, and semi-supervised learning to leverage large unlabelled speech data significantly improve speech recognition performance for Kinyarwanda. Our approach uses public-domain data only. A new studio-quality speech dataset is collected from a public website, then used to train …
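The curriculum schedule mentioned in the abstract can be illustrated with a minimal sketch (this is not the authors' code; the length-based easy-to-hard ordering and the `curriculum_stages` helper are assumptions for illustration): fine-tuning begins on short, "easy" utterances and progressively admits longer ones.

```python
# Minimal sketch (hypothetical, not from the paper) of a length-based
# curriculum schedule for ASR fine-tuning: sort utterances by duration
# and expose them to the model in cumulative easy-to-hard stages.

def curriculum_stages(utterances, num_stages=3):
    """Split utterances into cumulative curriculum stages, shortest first.

    utterances: list of (utterance_id, duration_seconds) pairs.
    Returns num_stages lists; stage i contains every utterance admitted
    up to that stage, so each stage is a superset of the previous one.
    """
    ordered = sorted(utterances, key=lambda u: u[1])  # shortest first
    stages = []
    for i in range(1, num_stages + 1):
        cutoff = round(len(ordered) * i / num_stages)
        stages.append(ordered[:cutoff])
    return stages


# Toy example: six utterances with durations in seconds.
data = [("a", 12.0), ("b", 3.5), ("c", 7.2),
        ("d", 1.1), ("e", 20.0), ("f", 5.0)]
stages = curriculum_stages(data, num_stages=3)
```

In a real fine-tuning loop, each stage would correspond to a span of training steps, with the data loader drawing only from that stage's pool before moving to the next.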

