April 7, 2022, 1:11 a.m. | Paarth Neekhara, Jason Li, Boris Ginsburg

cs.CL updates on arXiv.org

Training neural text-to-speech (TTS) models for a new speaker typically
requires several hours of high-quality speech data. Prior work on voice
cloning attempts to address this challenge by adapting pre-trained multi-speaker
TTS models to a new voice, using a few minutes of speech data from the new
speaker. However, publicly available large multi-speaker datasets are often
noisy, thereby resulting in TTS models that are not suitable for use in
products. We address this challenge by proposing transfer-learning guidelines
for …

