Nov. 5, 2023, 6:47 a.m. | Youyuan Zhang, Sashank Gondala, Thiago Fraga-Silva, Christophe Van Gysel

cs.CL updates on arXiv.org arxiv.org

On-device Virtual Assistants (VAs) powered by Automatic Speech Recognition
(ASR) require effective knowledge integration for the challenging entity-rich
query recognition. In this paper, we conduct an empirical study of modeling
strategies for server-side rescoring of spoken information domain queries using
various categories of Language Models (LMs) (N-gram word LMs, sub-word neural
LMs). We investigate the combination of on-device and server-side signals, and
demonstrate significant WER improvements of 23%-35% on various entity-centric
query subpopulations by integrating various server-side LMs compared to …

arxiv asr assistants automatic speech recognition domain information integration knowledge language language models modeling paper query recognition server speech speech recognition spoken strategies study virtual virtual assistants word

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Data Scientist (Database Development)

@ Nasdaq | Bengaluru-Affluence