all AI news
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection. (arXiv:2206.07458v1 [cs.CV])
Web: http://arxiv.org/abs/2206.07458
June 16, 2022, 1:10 a.m. | Joanna Hong, Minsu Kim, Yong Man Ro
cs.LG updates on arXiv.org arxiv.org
The goal of this work is to reconstruct speech from a silent talking face
video. Recent studies have shown impressive performance on synthesizing speech
from silent talking face videos. However, they have not explicitly considered
on varying identity characteristics of different speakers, which place a
challenge in the video-to-speech synthesis, and this becomes more critical in
unseen-speaker settings. Distinct from the previous methods, our approach is to
separate the speech content and the visage-style from a given silent talking
face …
More from arxiv.org / cs.LG updates on arXiv.org
Latest AI/ML/Big Data Jobs
Machine Learning Researcher - Saalfeld Lab
@ Howard Hughes Medical Institute - Chevy Chase, MD | Ashburn, Virginia
Project Director, Machine Learning in US Health
@ ideas42.org | Remote, US
Data Science Intern
@ NannyML | Remote
Machine Learning Engineer NLP/Speech
@ Play.ht | Remote
Research Scientist, 3D Reconstruction
@ Yembo | Remote, US
Clinical Assistant or Associate Professor of Management Science and Systems
@ University at Buffalo | Buffalo, NY