DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs. (arXiv:2201.11972v1 [eess.AS])
Web: http://arxiv.org/abs/2201.11972
Jan. 31, 2022, 2:10 a.m. | Songxiang Liu, Dan Su, Dong Yu
cs.CL updates on arXiv.org
Denoising diffusion probabilistic models (DDPMs) are expressive generative
models that have been used to solve a variety of speech synthesis problems.
However, because of their high sampling costs, DDPMs are difficult to use in
real-time speech processing applications. In this paper, we introduce
DiffGAN-TTS, a novel DDPM-based text-to-speech (TTS) model achieving
high-fidelity and efficient speech synthesis. DiffGAN-TTS is based on denoising
diffusion generative adversarial networks (GANs), which adopt an
adversarially-trained expressive model to approximate the denoising
distribution. We show with …