March 28, 2024, 4:41 a.m. | Leonardo Pepino, Pablo Riera, Luciana Ferrer, Agustin Gravano

cs.LG updates on arXiv.org

arXiv:2403.18635v1 Announce Type: new
Abstract: In this paper, we study different approaches for classifying emotions from speech using acoustic and text-based features. We propose to obtain contextualized word embeddings with BERT to represent the information contained in speech transcriptions and show that this results in better performance than using GloVe embeddings. We also propose and compare different strategies to combine the audio and text modalities, evaluating them on the IEMOCAP and MSP-PODCAST datasets. We find that fusing acoustic and text-based systems …

