April 27, 2023, 3 a.m. | Synced

Synced syncedreview.com

In the new paper NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers, a Microsoft team introduces NaturalSpeech 2, a TTS system with latent diffusion models for natural and strong zero-shot voice synthesis that captures expressive prosodies with superior robustness.


The post Microsoft’s NaturalSpeech 2 Outperforms Previous TTS Systems in Zero-Shot Speech and Singing Synthesis first appeared on Synced.

ai artificial intelligence deep-neural-networks diffusion diffusion models machine learning machine learning & data science microsoft ml natural paper research robustness speech speech-synthesis synthesis systems team technology text-to-speech tts voice voice synthesis zero shot learning

More from syncedreview.com / Synced

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Analyst (H/F)

@ Business & Decision | Montpellier, France

Machine Learning Researcher

@ VERSES | Brighton, England, United Kingdom - Remote