Web: http://arxiv.org/abs/2201.09429

Jan. 26, 2022, 2:11 a.m. | Xue Jiang, Xiulian Peng, Chengyu Zheng, Huaying Xue, Yuan Zhang, Yan Lu

cs.LG updates on arXiv.org

Deep-learning-based methods have shown advantages over traditional ones in
audio coding, but limited attention has been paid to real-time
communications (RTC). This paper proposes TFNet, an end-to-end neural audio
codec with low latency for RTC. It adopts an encoder-temporal filtering-decoder
paradigm that has seldom been investigated in audio coding. An interleaved
structure is proposed for temporal filtering to capture both short-term and
long-term temporal dependencies. Furthermore, with end-to-end optimization,
TFNet is jointly optimized with speech enhancement and …

