Feb. 20, 2024, 5:42 a.m. | Riccardo Miccini, Alessandro Cerioli, Clément Laroche, Tobias Piechowiak, Jens Sparsø, Luca Pezzarossa

cs.LG updates on arXiv.org

arXiv:2402.12263v1 Announce Type: new
Abstract: Despite the recent advances in model compression techniques for deep neural networks, deploying such models on ultra-low-power embedded devices still proves challenging. In particular, quantization schemes for Gated Recurrent Units (GRUs) are difficult to tune because of their dependence on an internal state, which prevents them from fully benefiting from sub-8-bit quantization. In this work, we propose a modular integer quantization scheme for GRUs in which the bit width of each operator can be selected independently. We …
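The core idea the abstract describes, integer quantization where each GRU operator's bit width is selected independently, can be sketched roughly as follows. This is a minimal NumPy illustration under assumed conventions: the fake_quant helper, the gru_step function, and the keys of the bits table are hypothetical names for a per-operator mixed-precision setup, not the paper's actual scheme (the truncated abstract does not specify its interface).

```python
import numpy as np

def fake_quant(x, bits):
    """Symmetric per-tensor fake quantization: map x onto a signed
    integer grid of the given bit width, then de-quantize."""
    qmax = 2.0 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax + 1e-12
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params, bits):
    """One GRU step where each operator's precision comes from the
    `bits` table (keys are illustrative). Biases omitted for brevity."""
    # Weights, recurrent weights, and the carried state each get
    # their own independently chosen bit width.
    Wz, Wr, Wn = (fake_quant(params[k], bits["weight"]) for k in ("Wz", "Wr", "Wn"))
    Uz, Ur, Un = (fake_quant(params[k], bits["recurrent"]) for k in ("Uz", "Ur", "Un"))
    hq = fake_quant(h, bits["state"])

    # Gate pre-activations are re-quantized at the activation width.
    z = sigmoid(fake_quant(x @ Wz + hq @ Uz, bits["activation"]))
    r = sigmoid(fake_quant(x @ Wr + hq @ Ur, bits["activation"]))
    n = np.tanh(fake_quant(x @ Wn + r * (hq @ Un), bits["activation"]))
    return (1.0 - z) * n + z * hq

# Example: 8-bit weights, 4-bit recurrent path, 6-bit state/activations.
rng = np.random.default_rng(0)
d_in, d_h = 16, 32
params = {k: rng.normal(scale=0.1, size=(d_in if k[0] == "W" else d_h, d_h))
          for k in ("Wz", "Wr", "Wn", "Uz", "Ur", "Un")}
bits = {"weight": 8, "recurrent": 4, "state": 6, "activation": 6}
h = np.zeros(d_h)
x = rng.normal(size=d_in)
h = gru_step(x, h, params, bits)
print(h.shape)  # (32,)
```

The point of the sketch is only the modularity: because each operator quantizes through its own entry in the table, the recurrent path and the internal state can be held at different precisions than the feed-forward weights, which is the tuning difficulty the abstract attributes to GRUs.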

