Aug. 1, 2022, 1:11 a.m. | Sung-Feng Huang, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, Hung-yi Lee

cs.CL updates on arXiv.org arxiv.org

Personalizing a speech synthesis system is a highly desired application,
where the system can generate speech with the user's voice with rare enrolled
recordings. There are two main approaches to build such a system in recent
works: speaker adaptation and speaker encoding. On the one hand, speaker
adaptation methods fine-tune a trained multi-speaker text-to-speech (TTS) model
with few enrolled samples. However, they require at least thousands of
fine-tuning steps for high-quality adaptation, making it hard to apply on
devices. On …

arxiv learning meta meta-learning speech text text-to-speech tts

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior AI & Data Engineer

@ Bertelsmann | Kuala Lumpur, 14, MY, 50400

Analytics Engineer

@ Reverse Tech | Philippines - Remote