March 5, 2024, 2:45 p.m. | Goeric Huybrechts, Srikanth Ronanki, Xilai Li, Hadis Nosrati, Sravan Bodapati, Katrin Kirchhoff

cs.LG updates on arXiv.org arxiv.org

arXiv:2306.08175v2 Announce Type: replace-cross
Abstract: Conformer-based end-to-end models have become ubiquitous these days and are commonly used in both streaming and non-streaming automatic speech recognition (ASR). Techniques like dual-mode and dynamic chunk training helped unify streaming and non-streaming systems. However, there remains a performance gap between streaming with a full and limited past context. To address this issue, we propose the integration of a novel dynamic contextual carry-over mechanism in a state-of-the-art (SOTA) unified ASR system. Our proposed dynamic context …

abstract arxiv asr automatic speech recognition become context cs.ai cs.lg cs.sd dynamic eess.as gap latency low low latency performance recognition speech speech recognition streaming systems training type

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Business Data Scientist, gTech Ads

@ Google | Mexico City, CDMX, Mexico

Lead, Data Analytics Operations

@ Zocdoc | Pune, Maharashtra, India