all AI news
CMGAN: Conformer-based Metric GAN for Speech Enhancement
March 5, 2024, 2:45 p.m. | Ruizhe Cao, Sherif Abdulatif, Bin Yang
cs.LG updates on arXiv.org arxiv.org
Abstract: Recently, convolution-augmented transformer (Conformer) has achieved promising performance in automatic speech recognition (ASR) and time-domain speech enhancement (SE), as it can capture both local and global dependencies in the speech signal. In this paper, we propose a conformer-based metric generative adversarial network (CMGAN) for SE in the time-frequency (TF) domain. In the generator, we utilize two-stage conformer blocks to aggregate all magnitude and complex spectrogram information by modeling both time and frequency dependencies. The estimation …
abstract adversarial arxiv asr automatic speech recognition convolution cs.ai cs.lg cs.sd dependencies domain eess.as gan generative generative adversarial network global network paper performance recognition signal speech speech recognition transformer type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
#13721 - Data Engineer - AI Model Testing
@ Qualitest | Miami, Florida, United States
Elasticsearch Administrator
@ ManTech | 201BF - Customer Site, Chantilly, VA