Oct. 27, 2022, 1:15 a.m. | Piyush Behre, Naveen Parihar, Sharman Tan, Amy Shah, Eva Sharma, Geoffrey Liu, Shuangyu Chang, Hosam Khalil, Chris Basoglu, Sayan Pathak

cs.CL updates on arXiv.org arxiv.org

Segmentation for continuous Automatic Speech Recognition (ASR) has
traditionally used silence timeouts or voice activity detectors (VADs), which
are both limited to acoustic features. This segmentation is often overly
aggressive, given that people naturally pause to think as they speak.
Consequently, segmentation happens mid-sentence, hindering both punctuation and
downstream tasks like machine translation for which high-quality segmentation
is critical. Model-based segmentation methods that leverage acoustic features
are powerful, but without an understanding of the language itself, these
approaches are limited. …

arxiv features segmentation smart speech

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Analyst

@ SEAKR Engineering | Englewood, CO, United States

Data Analyst II

@ Postman | Bengaluru, India

Data Architect

@ FORSEVEN | Warwick, GB

Director, Data Science

@ Visa | Washington, DC, United States

Senior Manager, Data Science - Emerging ML

@ Capital One | McLean, VA