Microsoft’s NaturalSpeech 2 Outperforms Previous TTS Systems in Zero-Shot Speech and Singing Synthesis
Synced syncedreview.com
In the new paper "NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers," a Microsoft team introduces NaturalSpeech 2, a text-to-speech (TTS) system that uses latent diffusion models to produce natural-sounding speech with strong zero-shot voice synthesis, capturing expressive prosody with superior robustness.