Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation | allainews.com

April 30, 2024, 4:51 a.m. | Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie

cs.CL updates on arXiv.org arxiv.org

arXiv:2310.14278v2 Announce Type: replace-cross
Abstract: Automatic Speech Recognition (ASR) in conversational settings presents unique challenges, including extracting relevant contextual information from previous conversational turns. Due to irrelevant content, error propagation, and redundancy, existing methods struggle to extract longer and more effective contexts. To address this issue, we introduce a novel conversational ASR system, extending the Conformer encoder-decoder model with cross-modal conversational representation. Our approach leverages a cross-modal extractor that combines pre-trained speech and text models through a specialized encoder and …

abstract arxiv asr audio automatic speech recognition challenges conversational cs.cl cs.sd eess.as error extract information issue modal novel propagation recognition redundancy representation speech speech recognition struggle textual type unique

More from arxiv.org / cs.CL updates on arXiv.org

Biomedical knowledge graph-optimized prompt generation for large language models 14 hours ago | arxiv.org

abstract arxiv biomedical biomedicine +27

Primacy Effect of ChatGPT 14 hours ago | arxiv.org

arxiv chatgpt cs.ai cs.cl +2

Are Models Trained on Indian Legal Data Fair? 14 hours ago | arxiv.org

abstract advances applications artificial +27

Silver-Tongued and Sundry: Exploring Intersectional Pronouns with ChatGPT 14 hours ago | arxiv.org

abstract agent arxiv chatgpt +13

Exploring the Potential of Conversational AI Support for Agent-Based Social Simulation Model Design 14 hours ago | arxiv.org

abstract agent ai-powered ai systems +21

Robot Detection System 1: Front-Following 14 hours ago | arxiv.org

abstract advantages arxiv cs.cl +14

Refinement of an Epilepsy Dictionary through Human Annotation of Health-related posts on Instagram 14 hours ago | arxiv.org

abstract annotation arxiv biomedical +12

Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in … 14 hours ago | arxiv.org

abstract arxiv beyond cs.ai +15

From Text to Context: An Entailment Approach for News Stakeholder Classification 14 hours ago | arxiv.org

abstract actors articles arxiv +13

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

View on ai-jobs.net

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

View on ai-jobs.net

Lead Developer (AI)

@ Cere Network | San Francisco, US

View on ai-jobs.net

Research Engineer

@ Allora Labs | Remote

View on ai-jobs.net

Ecosystem Manager

@ Allora Labs | Remote

View on ai-jobs.net

Founding AI Engineer, Agents

@ Occam AI | New York

View on ai-jobs.net