Web: http://arxiv.org/abs/2206.07458

June 16, 2022, 1:13 a.m. | Joanna Hong, Minsu Kim, Yong Man Ro

cs.CV updates on arXiv.org arxiv.org

The goal of this work is to reconstruct speech from a silent talking face
video. Recent studies have shown impressive performance on synthesizing speech
from silent talking face videos. However, they have not explicitly considered
on varying identity characteristics of different speakers, which place a
challenge in the video-to-speech synthesis, and this becomes more critical in
unseen-speaker settings. Distinct from the previous methods, our approach is to
separate the speech content and the visage-style from a given silent talking
face …

arxiv cv feature feature selection speech video

More from arxiv.org / cs.CV updates on arXiv.org

Machine Learning Researcher - Saalfeld Lab

@ Howard Hughes Medical Institute - Chevy Chase, MD | Ashburn, Virginia

Project Director, Machine Learning in US Health

@ ideas42.org | Remote, US

Data Science Intern

@ NannyML | Remote

Machine Learning Engineer NLP/Speech

@ Play.ht | Remote

Research Scientist, 3D Reconstruction

@ Yembo | Remote, US

Clinical Assistant or Associate Professor of Management Science and Systems

@ University at Buffalo | Buffalo, NY