March 29, 2024, 4:45 a.m. | Seyeon Kim, Siyoon Jin, Jihye Park, Kihong Kim, Jiyoung Kim, Jisu Nam, Seungryong Kim

cs.CV updates on arXiv.org

arXiv:2403.19144v1 Announce Type: new
Abstract: Conventional GAN-based models for talking head generation often suffer from limited quality and unstable training. Recent approaches based on diffusion models aimed to address these limitations and improve fidelity. However, they still face challenges, including extensive sampling times and difficulties in maintaining temporal consistency due to the high stochasticity of diffusion models. To overcome these challenges, we propose a novel motion-disentangled diffusion model for high-quality talking head generation, dubbed MoDiTalker. We introduce the two modules: …

