Oct. 18, 2022, 1:13 a.m. | Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Saeed Sarfjoo, Petr Motlicek, Matthias Kleinert, Hartmut Helmke, Oliver Ohneiser, Qingran Zhan

cs.CL updates on arXiv.org

Recent work on self-supervised pre-training focuses on leveraging large-scale
unlabeled speech data to build robust end-to-end (E2E) acoustic models (AM)
that can later be fine-tuned on downstream tasks, e.g., automatic speech
recognition (ASR). Yet, few works have investigated the impact on performance when
the data properties differ substantially between the pre-training and
fine-tuning phases, a scenario termed domain shift. We target this scenario by analyzing
the robustness of Wav2Vec 2.0 and XLS-R models on downstream ASR for a
completely unseen domain, air traffic …
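As a rough illustration of the workflow the abstract describes (not the paper's exact recipe), the sketch below fine-tunes a pre-trained Wav2Vec 2.0 encoder with a CTC head on a single in-domain (ATC-style) utterance using Hugging Face Transformers. The checkpoint name, audio, transcript, and hyperparameters are placeholders, not values from the paper.

```python
# Minimal sketch: adapting a self-supervised Wav2Vec 2.0 model to a new
# domain via CTC fine-tuning. All data and hyperparameters are placeholders.
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Freeze the convolutional feature encoder, a common choice when adapting
# to a new domain with limited labeled data.
model.freeze_feature_encoder()
model.train()

# One toy training step on a single (audio, transcript) pair.
audio = torch.randn(16000)  # 1 s of 16 kHz audio (placeholder waveform)
transcript = "CLEARED TO LAND RUNWAY TWO SEVEN"

inputs = processor(audio.numpy(), sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer(transcript, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = model(input_values=inputs.input_values, labels=labels).loss
loss.backward()
optimizer.step()
```

In practice this step would run over a full in-domain training set, which is where the domain shift between the self-supervised pre-training data and the fine-tuning data becomes the central question.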

air traffic arxiv asr benchmark communications traffic
