Jan. 14, 2022, 2:10 a.m. | Marcely Zanon Boito, Fethi Bougares, Florentin Barbier, Souhir Gahbiche, Loïc Barrault, Mickael Rouvier, Yannick Estève

cs.CL updates on arXiv.org

In this paper we present two datasets for Tamasheq, a developing language
mainly spoken in Mali and Niger. These two datasets were made available for the
IWSLT 2022 low-resource speech translation track, and they consist of
collections of radio recordings from the Studio Kalangou (Niger) and Studio
Tamani (Mali) daily broadcast news. We share (i) a massive amount of unlabeled
audio data (671 hours) in five languages: French from Niger, Fulfulde, Hausa,
Tamasheq and Zarma, and (ii) a smaller parallel …
