Getting More for Less: Using Weak Labels and AV-Mixup for Robust Audio-Visual Speaker Verification
June 14, 2024, 4:47 a.m. | Anith Selvakumar, Homa Fashandi
cs.LG updates on arXiv.org
Abstract: Distance Metric Learning (DML) has typically dominated the audio-visual speaker verification problem space, owing to its strong performance on new and unseen classes. In our work, we explore multitask learning techniques to further enhance DML and show that an auxiliary task with even weak labels can increase the quality of the learned speaker representation without increasing model complexity during inference. We also extend the Generalized End-to-End Loss (GE2E) to multimodal inputs and demonstrate that it can …