June 17, 2024, 4:47 a.m. | Chen Tang, Yuan Meng, Jiacheng Jiang, Shuzhao Xie, Rongwei Lu, Xinzhu Ma, Zhi Wang, Wenwu Zhu

cs.CV updates on arXiv.org

arXiv:2401.01543v2 Announce Type: replace
Abstract: Quantization is significant for compressing over-parameterized deep neural models and deploying them on resource-limited devices. Fixed-precision quantization suffers from a performance drop due to its limited numerical representation ability. Conversely, mixed-precision quantization (MPQ) is advocated to compress the model effectively by allocating heterogeneous bit-widths across layers. MPQ is typically organized into a two-stage search-then-retrain process. In this paper, we devise a one-shot training-searching paradigm for mixed-precision model compression. Specifically, in the first stage, all …
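To make the idea of heterogeneous bit-width allocation concrete, here is a minimal sketch of per-layer mixed-precision fake-quantization. It is illustrative only and not the paper's one-shot training-searching method; the layer names, the bit-width assignment in `bit_config`, and the symmetric uniform quantizer are all assumptions for the example.

```python
# Minimal sketch: per-layer mixed-precision (fake) quantization.
# Assumes symmetric uniform quantization; bit_config is a hypothetical allocation.
import torch

def quantize_symmetric(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Fake-quantize a weight tensor to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    scale = w.abs().max() / qmax
    if scale == 0:
        return w
    return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

# Heterogeneous bit-widths across layers: sensitive layers keep more bits,
# robust layers are compressed more aggressively.
bit_config = {"conv1": 8, "conv2": 4, "fc": 2}

model = torch.nn.ModuleDict({
    "conv1": torch.nn.Conv2d(3, 16, 3),
    "conv2": torch.nn.Conv2d(16, 32, 3),
    "fc": torch.nn.Linear(32, 10),
})

with torch.no_grad():
    for name, module in model.items():
        module.weight.copy_(quantize_symmetric(module.weight, bit_config[name]))
```

The point of MPQ, as the abstract notes, is that the per-layer assignment (here hard-coded) is itself the object of the search, which fixed-precision quantization collapses to a single bit-width for every layer.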

