Feb. 17, 2022, 8:11 a.m. | Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lore

cs.LG updates on arXiv.org

State-of-the-art text-to-speech (TTS) systems require several hours of
recorded speech data to generate high-quality synthetic speech. When using
reduced amounts of training data, standard TTS models suffer from speech
quality and intelligibility degradations, making training low-resource TTS
systems problematic. In this paper, we propose a novel extremely low-resource
TTS method called Voice Filter that uses as little as one minute of speech from
a target speaker. It uses voice conversion (VC) as a post-processing module
appended to a pre-existing high-quality …

