Web: http://arxiv.org/abs/2110.04934

Jan. 27, 2022, 2:11 a.m. | Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu

cs.LG updates on arXiv.org arxiv.org

The goal of self-supervised learning (SSL) for automatic speech recognition
(ASR) is to learn good speech representations from a large amount of unlabeled
speech for the downstream ASR task. However, most SSL frameworks do not
consider noise robustness which is crucial for real-world applications. In this
paper we propose wav2vec-Switch, a method to encode noise robustness into
contextualized representations of speech via contrastive learning.
Specifically, we feed original-noisy speech pairs simultaneously into the
wav2vec 2.0 network. In addition to the …

arxiv learning speech speech recognition

More from arxiv.org / cs.LG updates on arXiv.org

Data Analytics and Technical support Lead

@ Coupa Software, Inc. | Bogota, Colombia

Data Science Manager

@ Vectra | San Jose, CA

Data Analyst Sr

@ Capco | Brazil - Sao Paulo

Data Scientist (NLP)

@ Builder.ai | London, England, United Kingdom - Remote

Senior Data Analyst

@ BuildZoom | Scottsdale, AZ/ San Francisco, CA/ Remote

Senior Research Scientist, Speech Recognition

@ SoundHound Inc. | Toronto, Canada