Feb. 11, 2022, 2:11 a.m. | Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba

cs.LG updates on arXiv.org

We address the problem of cross-speaker style transfer for text-to-speech
(TTS) using data augmentation via voice conversion. We assume access to a corpus
of neutral non-expressive data from a target speaker and supporting
conversational expressive data from different speakers. Our goal is to build a
TTS system that is expressive, while retaining the target speaker's identity.
The proposed approach relies on voice conversion to first generate high-quality
data from the set of supporting expressive speakers. The voice-converted data
is …

