Sept. 8, 2022, 1:14 a.m. | Huu-Tien Dang, Thi-Hai-Yen Vuong, Xuan-Hieu Phan

cs.CL updates on arXiv.org arxiv.org

Converting written texts into their spoken forms is an essential problem in
any text-to-speech (TTS) systems. However, building an effective text
normalization solution for a real-world TTS system face two main challenges:
(1) the semantic ambiguity of non-standard words (NSWs), e.g., numbers, dates,
ranges, scores, abbreviations, and (2) transforming NSWs into pronounceable
syllables, such as URL, email address, hashtag, and contact name. In this
paper, we propose a new two-phase normalization approach to deal with these
challenges. First, a model-based …

arxiv detection normalization speech standard text text-to-speech

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York