June 14, 2024, 4:47 a.m. | Anith Selvakumar, Homa Fashandi

cs.LG updates on arXiv.org

arXiv:2309.07115v2 Announce Type: replace-cross
Abstract: Distance Metric Learning (DML) has typically dominated the audio-visual speaker verification problem space, owing to its strong performance on new and unseen classes. In our work, we explore multitask learning techniques to further enhance DML and show that an auxiliary task with even weak labels can increase the quality of the learned speaker representation without increasing model complexity during inference. We also extend the Generalized End-to-End Loss (GE2E) to multimodal inputs and demonstrate that it can …

