June 27, 2024, 4:42 a.m. | Haiyang Sun, Fulin Zhang, Yingying Gao, Zheng Lian, Shilei Zhang, Junlan Feng

cs.CL updates on arXiv.org

arXiv:2306.09361v3 Announce Type: replace-cross
Abstract: Speech Emotion Recognition (SER) is an important research topic in human-computer interaction. Many recent works focus on directly extracting emotional cues through pre-trained knowledge, frequently overlooking considerations of appropriateness and comprehensiveness. Therefore, we propose a novel framework for pre-training knowledge in SER, called Multi-perspective Fusion Search Network (MFSN). Considering comprehensiveness, we partition speech knowledge into Textual-related Emotional Content (TEC) and Speech-related Emotional Content (SEC), capturing cues from both semantic and acoustic perspectives, and we design …
