June 6, 2024, 4:43 a.m. | Aviv Shamsian, Aviv Navon, Neta Glazer, Gill Hetz, Joseph Keshet

cs.LG updates on arXiv.org arxiv.org

arXiv:2406.02649v1 Announce Type: cross
Abstract: Automatic Speech Recognition (ASR) technology has made significant progress in recent years, providing accurate transcription across various domains. However, some challenges remain, especially in noisy environments and specialized jargon. In this paper, we propose a novel approach for improved jargon word recognition by contextual biasing Whisper-based models. We employ a keyword spotting model that leverages the Whisper encoder representation to dynamically generate prompts for guiding the decoder during the transcription process. We introduce two approaches …

abstract arxiv asr automatic speech recognition challenges cs.lg cs.sd domains eess.as environments however jargon novel paper progress recognition speech speech recognition technology transcription type whisper word

Senior Machine Learning Engineer

@ GPTZero | Toronto, Canada

ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)

@ HelloBetter | Remote

Senior Research Engineer/Specialist - Motor Mechanical Design

@ GKN Aerospace | Bristol, GB

Research Engineer (Motor Mechanical Design)

@ GKN Aerospace | Bristol, GB

Senior Research Engineer (Electromagnetic Design)

@ GKN Aerospace | Bristol, GB

Associate Research Engineer Clubs | Titleist

@ Acushnet Company | Carlsbad, CA, United States