Aug. 1, 2022, 1:11 a.m. | Peng Shen, Xugang Lu, Hisashi Kawai

cs.CL updates on arXiv.org

For Mandarin end-to-end (E2E) automatic speech recognition (ASR) tasks,
pronunciation-based modeling units can improve the sharing of modeling units
during model training compared to character-based modeling units, but they
suffer from homophone problems. In this study, we propose a novel
pronunciation-aware unique character encoding for building E2E RNN-T-based
Mandarin ASR systems. The proposed encoding combines a
pronunciation-based syllable with a character index (CI). By introducing the CI,
the RNN-T model can overcome the homophone problem while utilizing the
pronunciation …
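
To make the idea of pairing a syllable with a character index concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the toy lexicon, the use of toneless pinyin, the rule of numbering homophones in lexicon order, and the function names `build_encoding`, `encode`, and `decode` are all hypothetical, and the scheme in the paper may differ in its details.

```python
from collections import defaultdict

# Hypothetical toy lexicon (assumption): character -> toneless pinyin syllable.
TOY_LEXICON = {
    "是": "shi",
    "事": "shi",
    "市": "shi",
    "马": "ma",
    "码": "ma",
}


def build_encoding(lexicon):
    """Give every character a unique (syllable, character index) pair.

    Characters that share a syllable are numbered 0, 1, 2, ... in lexicon
    order, so the syllable part is shared while the index part separates
    homophones. The numbering rule is an assumption made for this sketch.
    """
    next_index = defaultdict(int)  # next free index for each syllable
    char_to_code = {}
    for char, syllable in lexicon.items():
        char_to_code[char] = (syllable, next_index[syllable])
        next_index[syllable] += 1
    code_to_char = {code: char for char, code in char_to_code.items()}
    return char_to_code, code_to_char


def encode(text, char_to_code):
    """Map a character string to a list of (syllable, index) pairs."""
    return [char_to_code[char] for char in text]


def decode(codes, code_to_char):
    """Map (syllable, index) pairs back to the original characters."""
    return "".join(code_to_char[code] for code in codes)


char_to_code, code_to_char = build_encoding(TOY_LEXICON)
codes = encode("是码", char_to_code)
print(codes)                        # [('shi', 0), ('ma', 1)]
print(decode(codes, code_to_char))  # 是码
```

In this sketch the three characters read "shi" share one syllable label, which is the kind of unit sharing the abstract attributes to pronunciation-based modeling, while the index keeps each pair unique so that decoding back to characters is unambiguous.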

