May 8, 2024, 4:46 a.m. | Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazım Kemal Ekenel, Alexander Waibel

cs.CV updates on arXiv.org

arXiv:2405.04327v1 Announce Type: new
Abstract: In the task of talking face generation, the objective is to generate a face video with lips synchronized to the corresponding audio while preserving visual details and identity information. Current methods face the challenge of learning accurate lip synchronization while avoiding detrimental effects on visual quality, as well as robustly evaluating such synchronization. To tackle these problems, we propose utilizing an audio-visual speech representation expert (AV-HuBERT) for calculating lip synchronization loss during training. Moreover, leveraging …
