April 20, 2023, 3:30 p.m. | Maximilian Schreiner

THE DECODER the-decoder.com


Microsofts NaturalSpeech 2 is a text-to-speech model that is based on diffusion models and clones any voice with a few seconds of audio.


The article NaturalSpeech 2: Microsoft edges closer to zero-shot voice cloning appeared first on THE DECODER.

ai and language ai research article artificial intelligence audio cloning decoder diffusion diffusion models generative-ai microsoft speech text text-to-speech voice voice cloning

More from the-decoder.com / THE DECODER

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

IT Commercial Data Analyst - ESO

@ National Grid | Warwick, GB, CV34 6DA

Stagiaire Data Analyst – Banque Privée - Juillet 2024

@ Rothschild & Co | Paris (Messine-29)

Operations Research Scientist I - Network Optimization Focus

@ CSX | Jacksonville, FL, United States

Machine Learning Operations Engineer

@ Intellectsoft | Baku, Baku, Azerbaijan - Remote

Data Analyst

@ Health Care Service Corporation | Richardson Texas HQ (1001 E. Lookout Drive)