Jan. 13, 2022, 2:10 a.m. | Minsu Kang, Sungjae Kim, Injung Kim

cs.LG updates on arXiv.org

We propose a novel high-fidelity expressive speech synthesis model, UniTTS,
that learns and controls overlapping style attributes while avoiding interference.
UniTTS represents multiple style attributes in a single unified embedding space
as the residuals between the phoneme embeddings before and after applying the
attributes. The proposed method is especially effective in controlling multiple
attributes that are difficult to separate cleanly, such as speaker ID and
emotion, because it minimizes redundancy when adding variance in speaker ID and
emotion and, additionally, predicts …
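
The residual idea in the abstract can be illustrated with a toy NumPy sketch. This is not the UniTTS architecture: the embedding width, the random phoneme embeddings, and the `apply_attribute` helper (a plain additive shift) are all hypothetical stand-ins chosen only to show how a style attribute might be stored as the difference between phoneme embeddings before and after the attribute is applied, so that several attributes share one embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical embedding width, not taken from the paper

# Hypothetical phoneme embeddings for a 5-phoneme utterance.
phonemes = rng.normal(size=(5, DIM))


def apply_attribute(x, shift):
    """Toy stand-in for a learned module that applies one style attribute.

    Assumption: the attribute acts as a simple additive shift.
    """
    return x + shift


def style_residual(before, after):
    """Represent an attribute as (embedding after) minus (embedding before)."""
    return after - before


# Hypothetical attribute effects for speaker ID and emotion.
speaker_shift = rng.normal(size=DIM)
emotion_shift = rng.normal(size=DIM)

# Apply speaker ID first and record its residual.
after_speaker = apply_attribute(phonemes, speaker_shift)
speaker_res = style_residual(phonemes, after_speaker)

# Apply emotion on top, so its residual holds only what emotion adds.
after_emotion = apply_attribute(after_speaker, emotion_shift)
emotion_res = style_residual(after_speaker, after_emotion)

# Both residuals have the same shape as the phoneme embeddings,
# so they can be combined by simple addition in one shared space.
styled = phonemes + speaker_res + emotion_res
```

Because the emotion residual is measured against embeddings that already carry the speaker attribute, it captures only the variance that emotion adds in this toy setup, which is one way to read the abstract's point about minimizing redundancy.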
