Web: http://arxiv.org/abs/2203.15206

June 16, 2022, 1:12 a.m. | Fangyuan Wang, Bo Xu

cs.CL updates on arXiv.org

Currently, there are three main Transformer-encoder-based streaming End-to-End
(E2E) Automatic Speech Recognition (ASR) approaches: time-restricted methods,
chunk-wise methods, and memory-based methods. However, each of them has
limitations in global context modeling, linear computational complexity, or
model parallelism. In this work, we aim to build a single model that achieves
the benefits of all three aspects for streaming E2E ASR. In particular, we
propose to use a shifted chunk mechanism instead of …

