all AI news
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement. (arXiv:2209.11112v1 [cs.SD])
cs.LG updates on arXiv.org arxiv.org
Convolution-augmented transformers (Conformers) are recently proposed in
various speech-domain applications, such as automatic speech recognition (ASR)
and speech separation, as they can capture both local and global dependencies.
In this paper, we propose a conformer-based metric generative adversarial
network (CMGAN) for speech enhancement (SE) in the time-frequency (TF) domain.
The generator encodes the magnitude and complex spectrogram information using
two-stage conformer blocks to model both time and frequency dependencies. The
decoder then decouples the estimation into a magnitude mask decoder …