June 7, 2022, 1:12 a.m. | Ziyue Jiang, Su Zhe, Zhou Zhao, Qian Yang, Yi Ren, Jinglin Liu, Zhenhui Ye

cs.CL updates on arXiv.org arxiv.org

Polyphone disambiguation aims to capture accurate pronunciation knowledge
from natural text sequences for reliable Text-to-speech (TTS) systems. However,
previous approaches require substantial annotated training data and additional
efforts from language experts, making it difficult to extend high-quality
neural TTS systems to out-of-domain daily conversations and countless languages
worldwide. This paper tackles the polyphone disambiguation problem from a
concise and novel perspective: we propose Dict-TTS, a semantic-aware generative
text-to-speech model with an online website dictionary (the existing prior
information in the …

arxiv dictionary knowledge learning prior speech text text-to-speech tts

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior AI & Data Engineer

@ Bertelsmann | Kuala Lumpur, 14, MY, 50400

Analytics Engineer

@ Reverse Tech | Philippines - Remote