Aug. 10, 2023, 4:44 a.m. | Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter

cs.LG updates on arXiv.org arxiv.org

With read-aloud speech synthesis achieving high naturalness scores, there is
a growing research interest in synthesising spontaneous speech. However, human
spontaneous face-to-face conversation has both spoken and non-verbal aspects
(here, co-speech gestures). Only recently has research begun to explore the
benefits of jointly synthesising these two modalities in a single system. The
previous state of the art used non-probabilistic methods, which fail to capture
the variability of human speech and motion, and risk producing oversmoothing
artefacts and sub-optimal synthesis quality. …

arxiv benefits conversation denoising diff explore face human research speech spoken synthesis verbal

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US