Web: http://arxiv.org/abs/2201.10375

Jan. 26, 2022, 2:11 a.m. | Artem Gorodetskii, Ivan Ozhiganov

cs.LG updates on arXiv.org arxiv.org

With recent advances in voice cloning, speech synthesis for a target
speaker has approached human-level quality. However, autoregressive voice
cloning systems still suffer from text alignment failures, which leave them
unable to synthesize long sentences. In this work, we propose a variant of an
attention-based text-to-speech system that can reproduce a target voice from a
few seconds of reference speech and also generalizes to very long utterances.
The proposed system is based on …

