Retraining-free Model Quantization via One-Shot Weight-Coupling Learning
June 17, 2024, 4:47 a.m. | Chen Tang, Yuan Meng, Jiacheng Jiang, Shuzhao Xie, Rongwei Lu, Xinzhu Ma, Zhi Wang, Wenwu Zhu
cs.CV updates on arXiv.org arxiv.org
Abstract: Quantization is important for compressing over-parameterized deep neural models and deploying them on resource-limited devices. Fixed-precision quantization suffers from a performance drop due to its limited numerical representation ability. In contrast, mixed-precision quantization (MPQ) compresses the model effectively by allocating heterogeneous bit-widths to layers. MPQ is typically organized as a two-stage searching-retraining process. In this paper, we devise a one-shot training-searching paradigm for mixed-precision model compression. Specifically, in the first stage, all …
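To make the contrast between fixed-precision and mixed-precision quantization concrete, here is a minimal sketch in plain Python. It is not the authors' method: it uses a generic symmetric uniform quantizer, and the layer names, weight values, and bit-width policies are hypothetical values chosen only for illustration. It shows how a per-layer bit-width changes the reconstruction error of that layer's weights.

```python
# Illustrative sketch only: symmetric uniform quantization with per-layer bit-widths.
# The layer names, weights, and policies below are hypothetical, not taken from the paper.

def quantize(weights, bits):
    """Round weights onto a symmetric signed grid, then map them back to floats."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits (one sign bit plus one level)")
    qmax = 2 ** (bits - 1) - 1                      # largest positive integer level
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # float step between adjacent levels
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def mean_squared_error(original, approx):
    """Average squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(original, approx)) / len(original)

def layer_errors(layers, policy):
    """Quantize each layer at the bit-width the policy assigns and report its error."""
    return {
        name: mean_squared_error(weights, quantize(weights, policy[name]))
        for name, weights in layers.items()
    }

# Hypothetical two-layer model.
layers = {
    "conv1": [0.42, -0.17, 0.88, -0.65, 0.03, 0.29],
    "fc": [0.05, -0.91, 0.33, 0.12, -0.48, 0.77],
}

fixed_policy = {"conv1": 4, "fc": 4}  # fixed precision: the same bit-width everywhere
mixed_policy = {"conv1": 6, "fc": 2}  # mixed precision: same average, different split

if __name__ == "__main__":
    for label, policy in (("fixed", fixed_policy), ("mixed", mixed_policy)):
        print(label, layer_errors(layers, policy))
```

Both policies average 4 bits per layer, but they distribute the error differently across layers. An MPQ search looks for the allocation that best preserves accuracy under a size or compute budget. How the paper performs that search is not described in the truncated abstract above.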