EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning. (arXiv:2206.07860v1 [cs.SD])
cs.LG updates on arXiv.org
Speech generation and enhancement based on articulatory movements facilitate
communication when ordinary verbal communication is not possible, e.g., for
patients who have lost the ability to speak. Although various techniques have
been proposed to this end, electropalatography (EPG), which is a monitoring
technique that records contact between the tongue and hard palate during
speech, has not been adequately explored. Herein, we propose a novel multimodal
EPG-to-speech (EPG2S) system that utilizes EPG and speech signals for speech
generation and enhancement. …