April 28, 2022, 1:11 a.m. | Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei

cs.CL updates on arXiv.org

Recently, self-supervised learning (SSL) has demonstrated strong performance
in speaker recognition, even when the pre-training objective is designed for
speech recognition. In this paper, we study which factors lead to the success
of self-supervised learning on speaker-related tasks, e.g., speaker
verification (SV), through a series of carefully designed experiments. Our
empirical results on the VoxCeleb1 dataset suggest that the benefit of SSL to
the SV task comes from a combination of the masked speech prediction loss,
data scale, and model size, while …
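
The abstract attributes part of the SV gain to the masked speech prediction pre-training loss. As a rough illustration of that kind of objective (not the authors' implementation), the PyTorch sketch below masks a random fraction of input frames and trains a small Transformer encoder to predict discrete frame targets only at the masked positions, HuBERT-style. All names, shapes, and hyperparameters here (MaskedSpeechPredictor, mask_ratio, n_targets) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedSpeechPredictor(nn.Module):
    """Sketch of a masked speech prediction objective (HuBERT-style)."""

    def __init__(self, feat_dim=80, hidden=256, n_targets=320):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.mask_embed = nn.Parameter(torch.zeros(hidden))  # learned mask token
        self.head = nn.Linear(hidden, n_targets)  # predicts discrete frame targets

    def forward(self, feats, targets, mask_ratio=0.08):
        # feats: (B, T, feat_dim) acoustic frames; targets: (B, T) discrete labels
        x = self.proj(feats)
        mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio  # (B, T) bool
        # Replace masked frames with the learned mask embedding
        x = torch.where(mask.unsqueeze(-1), self.mask_embed.expand_as(x), x)
        logits = self.head(self.encoder(x))  # (B, T, n_targets)
        # Loss is computed only over masked positions
        return F.cross_entropy(logits[mask], targets[mask])

# Toy usage: 2 utterances of 50 frames with 80-dim features
model = MaskedSpeechPredictor()
feats = torch.randn(2, 50, 80)
targets = torch.randint(0, 320, (2, 50))
loss = model(feats, targets)
loss.backward()
```

In the paper's setting, speaker-discriminative information would then be read out of the encoder's representations for the SV task; this sketch covers only the pre-training loss itself.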

Tags: arxiv, learning, self-supervised learning, speech, speech recognition, supervised learning
