June 27, 2022, 1:11 a.m. | Mutian He, Jingzhou Yang, Lei He, Frank K. Soong

cs.CL updates on arXiv.org

End-to-end TTS requires a large amount of speech/text paired data to cover
all necessary knowledge, particularly how to pronounce different words in
diverse contexts, so that a neural model may learn such knowledge accordingly.
But in real applications, such a high demand for training data is hard to
satisfy, and additional knowledge often needs to be injected manually. For
example, to capture pronunciation knowledge for languages without regular
orthography, a complicated grapheme-to-phoneme pipeline needs to be built based
on a …

