UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control. (arXiv:2106.11171v2 [eess.AS] UPDATED)
Jan. 13, 2022, 2:10 a.m. | Minsu Kang, Sungjae Kim, Injung Kim
cs.LG updates on arXiv.org arxiv.org
We propose UniTTS, a novel high-fidelity expressive speech synthesis model
that learns and controls overlapping style attributes while avoiding interference between them.
UniTTS represents multiple style attributes in a single unified embedding space
as the residuals between the phoneme embeddings before and after applying the
attributes. The proposed method is especially effective in controlling multiple
attributes that are difficult to separate cleanly, such as speaker ID and
emotion, because it minimizes redundancy when adding variance in speaker ID and
emotion, and additionally, predicts …
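
The residual idea in the abstract can be illustrated with a toy sketch. This is not the UniTTS architecture: `apply_attribute` is a hypothetical stand-in for a learned conditioning layer, and the vectors, scales, and shifts are made-up values chosen only to show how each attribute can be stored as the difference between the phoneme embedding before and after that attribute is applied, so that all attributes live in the same space as the phoneme embedding.

```python
from typing import List, Sequence

Vector = List[float]


def apply_attribute(embedding: Sequence[float], scale: float, shift: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a learned layer that conditions an embedding on one attribute."""
    return [scale * e + s for e, s in zip(embedding, shift)]


def residual(before: Sequence[float], after: Sequence[float]) -> Vector:
    """Represent an attribute as the change it causes in the embedding."""
    return [a - b for a, b in zip(after, before)]


def add(*vectors: Sequence[float]) -> Vector:
    """Element-wise sum of vectors that share one embedding space."""
    return [sum(components) for components in zip(*vectors)]


# Toy phoneme embedding (assumed values, not from the paper).
phoneme = [1.0, 2.0, 0.5]

# Apply a speaker attribute, then an emotion attribute on top of it.
after_speaker = apply_attribute(phoneme, scale=1.0, shift=[0.5, -0.25, 0.0])
after_emotion = apply_attribute(after_speaker, scale=2.0, shift=[0.25, 0.0, -0.5])

# Each attribute is stored as a residual in the same space as the phoneme embedding.
speaker_residual = residual(phoneme, after_speaker)
emotion_residual = residual(after_speaker, after_emotion)

# Adding the residuals back to the phoneme embedding recovers the fully styled embedding.
styled = add(phoneme, speaker_residual, emotion_residual)
print(styled)
```

Because the emotion residual is measured relative to the speaker-conditioned embedding, it captures only what emotion adds on top of the speaker, which is one simple way to read the abstract's point about minimizing redundancy between the two attributes.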