ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech. (arXiv:2202.07816v1 [eess.AS])
Feb. 17, 2022, 8:10 a.m. | Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao
cs.CL updates on arXiv.org arxiv.org
Expressive text-to-speech (TTS) has become a hot research topic recently,
mainly focusing on modeling prosody in speech. Prosody modeling has several
challenges: 1) the extracted pitch used in previous prosody modeling work has
inevitable errors, which hurt prosody modeling; 2) different attributes of
prosody (e.g., pitch, duration, and energy) depend on each other and
jointly produce natural prosody; and 3) due to the high variability of prosody
and the limited amount of high-quality data for TTS training, the …