all AI news
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs. (arXiv:2201.11972v1 [eess.AS])
Jan. 31, 2022, 2:10 a.m. | Songxiang Liu, Dan Su, Dong Yu
cs.CL updates on arXiv.org arxiv.org
Denoising diffusion probabilistic models (DDPMs) are expressive generative
models that have been used to solve a variety of speech synthesis problems.
However, because of their high sampling costs, DDPMs are difficult to use in
real-time speech processing applications. In this paper, we introduce
DiffGAN-TTS, a novel DDPM-based text-to-speech (TTS) model achieving
high-fidelity and efficient speech synthesis. DiffGAN-TTS is based on denoising
diffusion generative adversarial networks (GANs), which adopt an
adversarially-trained expressive model to approximate the denoising
distribution. We show with …
More from arxiv.org / cs.CL updates on arXiv.org
VAL: Interactive Task Learning with GPT Dialog Parsing
1 day, 11 hours ago |
arxiv.org
DBCopilot: Scaling Natural Language Querying to Massive Databases
1 day, 11 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Staff Software Engineer, Generative AI, Google Cloud AI
@ Google | Mountain View, CA, USA; Sunnyvale, CA, USA
Expert Data Sciences
@ Gainwell Technologies | Any city, CO, US, 99999