Web: http://arxiv.org/abs/2205.05474

May 12, 2022, 1:11 a.m. | Hendrik Schröter, Alberto N. Escalante-B., Tobias Rosenkranz, Andreas Maier

cs.LG updates on arXiv.org

Deep learning-based speech enhancement has improved substantially and
has recently been extended to full-band audio (48 kHz). However, many approaches
have rather high computational complexity and require large temporal buffers
for real-time use, e.g., due to temporal convolutions or attention. Both make
these approaches infeasible on embedded devices. This work further extends
DeepFilterNet, which exploits the harmonic structure of speech to allow for
efficient speech enhancement (SE). Several optimizations in the training
procedure, data augmentation, and network structure …

