Web: http://arxiv.org/abs/2201.11972

Jan. 31, 2022, 2:10 a.m. | Songxiang Liu, Dan Su, Dong Yu

cs.CL updates on arXiv.org

Denoising diffusion probabilistic models (DDPMs) are expressive generative
models that have been used to solve a variety of speech synthesis problems.
However, because of their high sampling costs, DDPMs are difficult to use in
real-time speech processing applications. In this paper, we introduce
DiffGAN-TTS, a novel DDPM-based text-to-speech (TTS) model achieving
high-fidelity and efficient speech synthesis. DiffGAN-TTS is based on denoising
diffusion generative adversarial networks (GANs), which adopt an
adversarially-trained expressive model to approximate the denoising
distribution. We show with …

