all AI news
Efficient infusion of self-supervised representations in Automatic Speech Recognition
April 22, 2024, 4:46 a.m. | Darshan Prabhu, Sai Ganesh Mirishkar, Pankaj Wasnik
cs.CL updates on arXiv.org arxiv.org
Abstract: Self-supervised learned (SSL) models such as Wav2vec and HuBERT yield state-of-the-art results on speech-related tasks. Given the effectiveness of such models, it is advantageous to use them in conventional ASR systems. While some approaches suggest incorporating these models as a trainable encoder or a learnable frontend, training such systems is extremely slow and requires a lot of computation cycles. In this work, we propose two simple approaches that use (1) framewise addition and (2) cross-attention …
abstract art arxiv asr automatic speech recognition cs.cl encoder frontend recognition results speech speech recognition ssl state systems tasks them type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
DevOps Engineer (Data Team)
@ Reward Gateway | Sofia/Plovdiv