SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space

May 10, 2024, 4:45 a.m. | Zeren Zhang, Haibo Qin, Jiayu Huang, Yixin Li, Hui Lin, Yitao Duan, Jinwen Ma

cs.CV updates on arXiv.org | arxiv.org

arXiv:2405.05636v1 Announce Type: new
Abstract: Combining face swapping with lip synchronization technology offers a cost-effective solution for customized talking face generation. However, directly cascading existing models together tends to introduce significant interference between tasks and reduce video clarity because the interaction space is limited to the low-level semantic RGB space. To address this issue, we propose an innovative unified framework, SwapTalk, which accomplishes both face swapping and lip synchronization tasks in the same latent space. Referring to recent work on …
