March 14, 2024, 4:42 a.m. | Ziqi Liang, Haoxiang Shi, Jiawei Wang, Keda Lu

cs.LG updates on

arXiv:2403.08164v1 Announce Type: cross
Abstract: Recently, deep learning-based Text-to-Speech (TTS) systems have achieved high-quality speech synthesis results. Recurrent neural networks have become a standard modeling technique for sequential data in TTS systems and are widely used. However, training a TTS model which includes RNN components requires powerful GPU performance and takes a long time. In contrast, CNN-based sequence synthesis techniques can significantly reduce the parameters and training time of a TTS model while guaranteeing a certain performance due to their …

abstract arxiv become components cs.lg data deep learning gpu however low modeling networks neural networks performance quality recurrent neural networks results rnn speech standard synthesis systems text text-to-speech training tts type

Senior Data Engineer

@ Displate | Warsaw

Principal Software Engineer

@ Microsoft | Prague, Prague, Czech Republic

Sr. Global Reg. Affairs Manager

@ BASF | Research Triangle Park, NC, US, 27709-3528

Senior Robot Software Developer

@ OTTO Motors by Rockwell Automation | Kitchener, Ontario, Canada

Coop - Technical Service Hub Intern

@ Teradyne | Santiago de Queretaro, MX

Coop - Technical - Service Inside Sales Intern

@ Teradyne | Santiago de Queretaro, MX