May 1, 2024, 4:46 a.m. | Songtao Luo, Shuang Yang, Shiguang Shan, Xilin Chen

cs.CV updates on arXiv.org

arXiv:2310.05058v3 Announce Type: replace
Abstract: In this paper, we propose a novel method for speaker adaptation in lip reading, motivated by two observations. Firstly, a speaker's own characteristics can always be portrayed well by a few of his/her facial images, or even a single image, with shallow networks, while the fine-grained dynamic features associated with the speech content expressed by the talking face always need deep sequential networks to be represented accurately. Therefore, we treat the shallow and deep layers differently for speaker adaptive …
