Web: http://arxiv.org/abs/2206.08039

June 17, 2022, 1:10 a.m. | Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari

cs.LG updates on arXiv.org

We propose an end-to-end empathetic dialogue speech synthesis (DSS) model
that considers both the linguistic and prosodic contexts of dialogue history.
Empathy is the active attempt by humans to get inside the interlocutor in
dialogue, and empathetic DSS is a technology to implement this act in spoken
dialogue systems. Our model is conditioned on the history of linguistic and
prosodic features for predicting appropriate dialogue context. As such, it can
be regarded as an extension of the conventional linguistic-feature-based
dialogue …


