Nov. 1, 2022, 1:13 a.m. | Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda

cs.LG updates on arXiv.org

Our previous work, the unified source-filter GAN (uSFGAN) vocoder, introduced
a novel architecture based on source-filter theory into the parallel
waveform generative adversarial network to achieve high voice quality and pitch
controllability. However, its high-temporal-resolution inputs result in a high
computational cost. Although the HiFi-GAN vocoder achieves fast, high-fidelity
voice generation thanks to its efficient upsampling-based generator
architecture, its pitch controllability is severely limited. To realize a fast,
pitch-controllable, high-fidelity neural vocoder, we introduce the
source-filter theory …
