all AI news
Cross-speaker style transfer for text-to-speech using data augmentation. (arXiv:2202.05083v1 [eess.AS])
Feb. 11, 2022, 2:11 a.m. | Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba
cs.LG updates on arXiv.org arxiv.org
We address the problem of cross-speaker style transfer for text-to-speech
(TTS) using data augmentation via voice conversion. We assume to have a corpus
of neutral non-expressive data from a target speaker and supporting
conversational expressive data from different speakers. Our goal is to build a
TTS system that is expressive, while retaining the target speaker's identity.
The proposed approach relies on voice conversion to first generate high-quality
data from the set of supporting expressive speakers. The voice converted data
is …
arxiv augmentation data speech style transfer text text-to-speech
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Technology Consultant Master Data Management (w/m/d)
@ SAP | Walldorf, DE, 69190
Research Engineer, Computer Vision, Google Research
@ Google | Nairobi, Kenya