Jan. 31, 2024, 4:43 p.m. | Qingcheng Zhao, Pengyu Long, Qixuan Zhang, Dafei Qin, Han Liang, Longwen Zhang, Yingliang Zhang, Jingyi Yu, Lan Xu

cs.CV updates on arXiv.org arxiv.org

The synthesis of 3D facial animations from speech has garnered considerable
attention. Due to the scarcity of high-quality 4D facial data and
well-annotated abundant multi-modality labels, previous methods often suffer
from limited realism and a lack of lexible conditioning. We address this
challenge through a trilogy. We first introduce Generalized Neural Parametric
Facial Asset (GNPFA), an efficient variational auto-encoder mapping facial
geometry and images to a highly generalized expression latent space, decoupling
expressions and identities. Then, we utilize GNPFA to …

animation animations arxiv attention challenge cs.cv data guidance labels quality speech synthesis through

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Lead Data Modeler

@ Sherwin-Williams | Cleveland, OH, United States