Fine-tuning wav2vec2 for speaker recognition. (arXiv:2109.15053v2 [cs.SD] UPDATED)
Web: http://arxiv.org/abs/2109.15053
May 9, 2022, 1:11 a.m. | Nik Vaessen, David A. van Leeuwen
cs.LG updates on arXiv.org arxiv.org
This paper explores applying the wav2vec2 framework to speaker recognition
instead of speech recognition. We study the effectiveness of the pre-trained
weights on the speaker recognition task, and how to pool the wav2vec2 output
sequence into a fixed-length speaker embedding. To adapt the framework to
speaker recognition, we propose a single-utterance classification variant with
CE or AAM softmax loss, and an utterance-pair classification variant with BCE
loss. Our best-performing variant, w2v2-aam, achieves a 1.88% EER on the
extended VoxCeleb1 …
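
As a rough illustration of two ideas the abstract names, the sketch below shows one way to pool a variable-length sequence of frame features into a fixed-length embedding (plain mean pooling, which may or may not be the pooling the authors settle on) and a simplified additive angular margin (AAM) softmax loss for a single utterance. The function names, the NumPy implementation, and the `margin` and `scale` values are assumptions made for illustration; none of them come from the paper, and the handling of angles where theta + margin exceeds pi is omitted.

```python
import numpy as np


def mean_pool(frames, mask=None):
    """Average a (T, D) sequence of frame features into one (D,) vector.

    `mask` is an optional length-T array of 1s (real frames) and 0s (padding).
    """
    frames = np.asarray(frames, dtype=np.float64)
    if mask is None:
        return frames.mean(axis=0)
    mask = np.asarray(mask, dtype=np.float64)[:, None]
    return (frames * mask).sum(axis=0) / mask.sum()


def aam_softmax_loss(embedding, class_weights, label, margin=0.2, scale=30.0):
    """Simplified AAM softmax loss for one embedding.

    embedding:     (D,) speaker embedding
    class_weights: (C, D) one weight vector per training speaker
    label:         index of the true speaker
    margin, scale: illustrative values, not taken from the paper
    """
    # Cosine similarity between the normalized embedding and each class vector.
    e = embedding / np.linalg.norm(embedding)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = np.clip(w @ e, -1.0, 1.0)

    # Add the angular margin to the true class only, then rescale.
    logits = scale * cos
    logits[label] = scale * np.cos(np.arccos(cos[label]) + margin)

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(50, 8))    # stand-in for a sequence of frame features
    weights = rng.normal(size=(4, 8))    # stand-in for 4 training speakers
    emb = mean_pool(frames)
    print(emb.shape, aam_softmax_loss(emb, weights, label=2))
```

The margin makes the true class harder to score well on during training, which pushes embeddings of the same speaker closer together in angle than ordinary cross-entropy would.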