June 30, 2022, 1:12 a.m. | Peter Makarov, Ammar Abbas, Mateusz Łajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou

cs.CL updates on arXiv.org

Generating expressive and contextually appropriate prosody remains a
challenge for modern text-to-speech (TTS) systems. This is particularly evident
for long, multi-sentence inputs. In this paper, we examine simple extensions to
a Transformer-based FastSpeech-like system, with the goal of improving prosody
for multi-sentence TTS. We find that long context, powerful text features, and
training on multi-speaker data all improve prosody. More interestingly, they
result in synergies. Long context disambiguates prosody, improves coherence,
and plays to the strengths of Transformers. Fine-tuning word-level …
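Below is a minimal, hypothetical sketch of the kind of extension the abstract describes: a FastSpeech-like (non-autoregressive) Transformer encoder that reads phonemes for the current sentence plus surrounding context, conditioned on word-level text features (e.g. from a pretrained language model) and a speaker embedding for multi-speaker training. The module names, dimensions, and wiring are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: FastSpeech-like encoder with long (multi-sentence)
# context, word-level text features, and a speaker embedding.
# All sizes and the feature-injection scheme are illustrative assumptions.
import torch
import torch.nn as nn


class ContextualFastSpeechEncoder(nn.Module):
    def __init__(self, n_phonemes=100, n_speakers=10, d_model=256,
                 d_word_feat=768, n_heads=4, n_layers=4):
        super().__init__()
        self.phoneme_emb = nn.Embedding(n_phonemes, d_model)
        self.speaker_emb = nn.Embedding(n_speakers, d_model)
        # Project word-level LM features (assumed d_word_feat-dim) to d_model
        # so they can be added to the phonemes of the word they cover.
        self.word_proj = nn.Linear(d_word_feat, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, phonemes, word_feats, word_to_phoneme, speaker_id):
        # phonemes:        (B, T)  phoneme ids for context + current sentence
        # word_feats:      (B, W, d_word_feat)  word-level text features
        # word_to_phoneme: (B, T)  index of the word covering each phoneme
        # speaker_id:      (B,)    speaker index (multi-speaker training)
        x = self.phoneme_emb(phonemes)                          # (B, T, d)
        w = self.word_proj(word_feats)                          # (B, W, d)
        # Broadcast each word's feature onto its phonemes.
        w_per_phoneme = torch.gather(
            w, 1, word_to_phoneme.unsqueeze(-1).expand(-1, -1, w.size(-1)))
        x = x + w_per_phoneme + self.speaker_emb(speaker_id).unsqueeze(1)
        return self.encoder(x)                                  # (B, T, d)


# Example: context sentences and the current sentence flattened into one
# long phoneme sequence (the "long context" input).
enc = ContextualFastSpeechEncoder()
B, T, W = 2, 50, 12
out = enc(torch.randint(0, 100, (B, T)),
          torch.randn(B, W, 768),
          torch.randint(0, W, (B, T)),
          torch.randint(0, 10, (B,)))
print(out.shape)  # torch.Size([2, 50, 256])
```

In this sketch, "long context" simply means the encoder attends over phonemes from neighbouring sentences as well as the current one, which is where the Transformer's global attention is assumed to help disambiguate prosody and improve coherence across sentence boundaries.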

arxiv tts
