R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
April 2, 2024, 7:52 p.m. | Heng-Jui Chang, James Glass
cs.CL updates on arXiv.org
Abstract: This paper introduces Robust Spin (R-Spin), a data-efficient domain-specific self-supervision method for speaker and noise-invariant speech representations by learning discrete acoustic units with speaker-invariant clustering (Spin). R-Spin resolves Spin's issues and enhances content representations by learning to predict acoustic pieces. R-Spin offers a 12X reduction in computational resources compared to previous state-of-the-art methods while outperforming them in severely distorted speech scenarios. This paper provides detailed analyses to show how discrete units contribute to speech encoder …