June 1, 2022, 1:12 a.m. | Yinghao Aaron Li, Cong Han, Nima Mesgarani

cs.CL updates on arXiv.org

Text-to-Speech (TTS) has recently seen great progress in synthesizing
high-quality speech owing to the rapid development of parallel TTS systems, but
producing speech with naturalistic prosodic variations, speaking styles and
emotional tones remains challenging. Moreover, since duration and speech are
generated separately, parallel TTS models still have problems finding the best
monotonic alignments that are crucial for naturalistic speech synthesis. Here,
we propose StyleTTS, a style-based generative model for parallel TTS that can
synthesize diverse speech with natural prosody from …

