Web: http://arxiv.org/abs/2209.11061

Sept. 23, 2022, 1:11 a.m. | Sina Alisamir, Fabien Ringeval, Francois Portet

cs.LG updates on arXiv.org

Voice Activity Detection (VAD) aims at detecting speech segments in an audio
signal, which is a necessary first step for many of today's speech-based
applications. Current state-of-the-art methods focus on training a neural
network that exploits features directly contained in the acoustics, such as Mel
Filter Banks (MFBs). Such methods therefore require an extra normalisation step
to adapt to a new domain where the acoustics are affected, which can simply be
due to a change of speaker, microphone, or environment. In …
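
As a rough illustration of the kind of normalisation step mentioned above, the following sketch applies per-utterance mean and variance normalisation to a matrix of filter-bank features. The abstract does not say which normalisation the authors have in mind, so this technique, the function name `normalise_per_utterance`, the array shapes, and the simulated gain and offset are all assumptions made for the example.

```python
import numpy as np


def normalise_per_utterance(features, eps=1e-8):
    """Scale each feature band to zero mean and unit variance over time.

    features: array of shape (num_frames, num_bands), e.g. log Mel filter banks.
    This is a hypothetical helper, not taken from the paper.
    """
    features = np.asarray(features, dtype=np.float64)
    mean = features.mean(axis=0, keepdims=True)  # per-band mean over frames
    std = features.std(axis=0, keepdims=True)    # per-band std over frames
    return (features - mean) / (std + eps)


# Synthetic stand-in for features of one utterance (assumed shape: 200 x 40).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 40))

# Simulated domain change: a constant gain and offset applied to every band.
shifted = 3.0 * clean + 5.0

# After normalisation the two versions should be numerically very close.
same = np.allclose(
    normalise_per_utterance(clean),
    normalise_per_utterance(shifted),
    atol=1e-6,
)
print(same)
```

A global gain and offset is only the simplest kind of acoustic mismatch; real changes of speaker, microphone, or environment are generally not removed by this kind of statistics-based scaling alone.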

