Multi-Modal Pre-Training for Automated Speech Recognition. (arXiv:2110.09890v2 [eess.AS] UPDATED)
cs.LG updates on arXiv.org
Traditionally, research in automated speech recognition has focused on
local-first encoding of audio representations to predict the spoken phonemes in
an utterance. Unfortunately, approaches relying on such hyper-local information
tend to be vulnerable to both local-level corruption (such as audio-frame drops
or loud noises) and global-level noise (such as environmental or background
noise) that has not been seen during training. In this work, we
introduce a novel approach that leverages a self-supervised learning technique
based on masked language modeling …
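To illustrate the masked-modeling idea mentioned above as applied to audio, here is a minimal sketch of how contiguous spans of feature frames might be masked before pre-training. All function names, parameters (e.g. `mask_prob`, `span`), and the zero-embedding replacement are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: span masking of audio feature frames, in the spirit of
# masked language modeling. Assumptions, not the paper's method.
import numpy as np

def mask_frames(frames, mask_prob=0.15, span=5, seed=0):
    """Zero out random contiguous spans of frames.

    frames: (T, D) array of audio feature frames.
    Returns the masked copy and a boolean mask of masked positions.
    """
    rng = np.random.default_rng(seed)
    T = frames.shape[0]
    mask = np.zeros(T, dtype=bool)
    # Each frame is a candidate span start with probability mask_prob.
    starts = rng.random(T) < mask_prob
    for t in np.flatnonzero(starts):
        mask[t:t + span] = True  # spans may overlap or clip at the end
    masked = frames.copy()
    masked[mask] = 0.0  # replace masked frames with a zero embedding
    return masked, mask

# A pre-training objective would then ask the model to reconstruct
# (or contrastively identify) the original content at masked positions.
frames = np.random.default_rng(1).standard_normal((100, 80))
masked, mask = mask_frames(frames)
```

The model never sees the masked frames, so it must infer them from surrounding context, which is what encourages representations that go beyond hyper-local information.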