Web: http://arxiv.org/abs/2201.10240

Jan. 26, 2022, 2:10 a.m. | Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-yiin Chang


The recurrent neural network transducer (RNN-T) has recently become the
mainstream end-to-end approach for streaming automatic speech recognition
(ASR). To estimate the output distributions over subword units, RNN-T uses a
fully connected layer as the joint network to fuse the acoustic representations
extracted using the acoustic encoder with the text representations obtained
using the prediction network based on the previous subword units. In this
paper, we propose to use gating, bilinear pooling, and a combination of them in
the joint …
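
As a rough illustration of the baseline the abstract describes, the following sketch shows one common way a fully connected joint network can fuse an acoustic-encoder vector with a prediction-network vector and produce a distribution over subword units. It is not the paper's implementation: fusion by concatenation, the tanh nonlinearity, all dimensions, the random weights, and the names `joint_network`, `W_joint`, and `W_out` are assumptions made for illustration. The gating and bilinear-pooling variants proposed in the paper are not shown, because their details are cut off in the excerpt above.

```python
import math
import random


def linear(x, W, b):
    """Affine map W @ x + b for a vector x (lists of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def joint_network(enc_t, pred_u, params):
    """Fuse one encoder frame with one prediction-network state.

    Assumed form: concatenate, apply a fully connected layer with tanh,
    then project to the output vocabulary (subword units plus blank).
    """
    fused = enc_t + pred_u  # list concatenation of the two representations
    hidden = [math.tanh(v)
              for v in linear(fused, params["W_joint"], params["b_joint"])]
    logits = linear(hidden, params["W_out"], params["b_out"])
    return softmax(logits)


# Hypothetical sizes; VOCAB counts the blank symbol as one of its entries.
ENC_DIM, PRED_DIM, JOINT_DIM, VOCAB = 4, 3, 5, 6
rng = random.Random(0)
params = {
    "W_joint": init_matrix(JOINT_DIM, ENC_DIM + PRED_DIM, rng),
    "b_joint": [0.0] * JOINT_DIM,
    "W_out": init_matrix(VOCAB, JOINT_DIM, rng),
    "b_out": [0.0] * VOCAB,
}

# Stand-ins for encoder outputs (T = 2 frames) and prediction-network
# outputs (U + 1 = 3 label positions).
enc = [[rng.uniform(-1, 1) for _ in range(ENC_DIM)] for _ in range(2)]
pred = [[rng.uniform(-1, 1) for _ in range(PRED_DIM)] for _ in range(3)]

# The joint network is evaluated for every (frame, label position) pair.
lattice = [[joint_network(e, p, params) for p in pred] for e in enc]
```

Here `lattice[t][u]` is the output distribution for frame `t` given the first `u` previously emitted subword units, which is the quantity a transducer loss or decoder would consume.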

