June 7, 2024, 4:51 a.m. | Jiaming Zhou, Shiwan Zhao, Hui Wang, Tian-Hao Zhang, Haoqin Sun, Xuechen Wang, Yong Qin

cs.CL updates on arXiv.org arxiv.org

arXiv:2406.03814v1 Announce Type: new
Abstract: The kNN-CTC model has proven to be effective for monolingual automatic speech recognition (ASR). However, its direct application to multilingual scenarios like code-switching, presents challenges. Although there is potential for performance improvement, a kNN-CTC model utilizing a single bilingual datastore can inadvertently introduce undesirable noise from the alternative language. To address this, we propose a novel kNN-CTC-based code-switching ASR (CS-ASR) framework that employs dual monolingual datastores and a gated datastore selection mechanism to reduce noise …

abstract application arxiv asr automatic speech recognition bilingual challenges chinese code cs.cl cs.sd eess.as english however improvement improving knn multilingual performance potential recognition speech speech recognition type zero-shot

Senior Data Engineer

@ Displate | Warsaw

Solution Architect

@ Philips | Bothell - B2 - Bothell 22050

Senior Product Development Engineer - Datacenter Products

@ NVIDIA | US, CA, Santa Clara

Systems Engineer - 2nd Shift (Onsite)

@ RTX | PW715: Asheville Site W Asheville Greenfield Site TBD , Asheville, NC, 28803 USA

System Test Engineers (HW & SW)

@ Novanta | Barcelona, Spain

Senior Solutions Architect, Energy

@ NVIDIA | US, TX, Remote