Feb. 2, 2024, 3:46 p.m. | Elio Gruttadauria (IP Paris, LTCI, IDS, S2A), Mathieu Fontaine (LTCI, IP Paris), Slim Essid (IDS, S2A, LTCI)

cs.LG updates on arXiv.org

Overlapped speech is notoriously problematic for speaker diarization systems. Consequently, the use of speech separation has recently been proposed to improve their performance. Although promising, speech separation models struggle with realistic data because they are trained on simulated mixtures with a fixed number of speakers. In this work, we introduce a new speech separation-guided diarization scheme suitable for the online speaker diarization of long meeting recordings with a variable number of speakers, as present in the AMI corpus. We envisage …

