EnCodecMAE: Leveraging neural codecs for universal audio representation learning
May 22, 2024, 4:43 a.m. | Leonardo Pepino, Pablo Riera, Luciana Ferrer
cs.LG updates on arXiv.org
Abstract: The goal of universal audio representation learning is to obtain foundational models that can be used for a variety of downstream tasks involving speech, music, and environmental sounds. To approach this problem, methods inspired by self-supervised learning for NLP, like BERT, or for computer vision, like masked autoencoders (MAE), are often adapted to the audio domain. In this work, we propose masking representations of the audio signal and training an MAE to reconstruct the …
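The masked-autoencoder objective described above can be illustrated with a minimal sketch. This is not the paper's implementation: the masking function, the use of zeros as a stand-in mask token, and the loss helper are all simplifications assumed here for illustration; a real model would use learned mask tokens and a transformer encoder/decoder over codec representations.

```python
import numpy as np

def mask_frames(frames, mask_ratio=0.5, rng=None):
    """Randomly pick frame indices to mask.

    In a real MAE the masked positions are replaced by a learned
    mask token (or dropped entirely); zeros are used here only to
    keep the sketch self-contained.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    n_frames = frames.shape[0]
    n_masked = int(n_frames * mask_ratio)
    masked_idx = rng.choice(n_frames, size=n_masked, replace=False)
    corrupted = frames.copy()
    corrupted[masked_idx] = 0.0  # stand-in for a mask token
    return corrupted, masked_idx

def masked_reconstruction_loss(pred, target, masked_idx):
    """MAE-style objective: mean squared error computed only over
    the masked positions, so the model is rewarded for inferring
    the hidden frames from the visible context."""
    diff = pred[masked_idx] - target[masked_idx]
    return float(np.mean(diff ** 2))

# Toy usage: 10 frames of 4-dim representations.
frames = np.random.default_rng(1).normal(size=(10, 4))
corrupted, idx = mask_frames(frames, mask_ratio=0.5)
loss = masked_reconstruction_loss(frames, frames, idx)  # perfect prediction -> 0.0
```

The key design point the sketch highlights is that the loss is restricted to masked positions, which forces the encoder to build representations that capture context rather than simply copying visible input.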