Feb. 8, 2024, 5:46 a.m. | Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu

cs.CL updates on arXiv.org

Speech enhancement systems are typically trained using pairs of clean and noisy speech. In audio-visual speech enhancement (AVSE), there is not as much ground-truth clean data available; most audio-visual datasets are collected in real-world environments with background noise and reverberation, hampering the development of AVSE. In this work, we introduce AV2Wav, a resynthesis-based audio-visual speech enhancement approach that can generate clean speech despite the challenges of real-world training data. We obtain a subset of nearly clean speech from an audio-visual …

cs.cl cs.sd eess.as
