April 24, 2023, 12:48 a.m. | Cal Peyser, Michael Picheny, Kyunghyun Cho, Rohit Prabhavalkar, Ronny Huang, Tara Sainath

cs.CL updates on arXiv.org

Unpaired text and audio injection have emerged as dominant methods for
improving ASR performance in the absence of a large labeled corpus. However,
little guidance exists on deploying these methods to improve production ASR
systems that are trained on very large supervised corpora and with realistic
requirements like a constrained model size and CPU budget, streaming
capability, and a rich lattice for rescoring and for downstream NLU tasks. In
this work, we compare three state-of-the-art semi-supervised methods
encompassing both unpaired …

