Web: http://arxiv.org/abs/2107.10998

Jan. 27, 2022, 2:10 a.m. | Dan Liu, Xi Chen, Jie Fu, Chen Ma, Xue Liu

cs.CV updates on arXiv.org

Inference time, model size, and accuracy are three key factors in deep model compression.

Most existing work addresses these three factors separately, as it is difficult to optimize them all at the same time.

For example, low-bit quantization aims at obtaining a faster model; weight-sharing quantization aims at improving compression ratio and accuracy; and mixed-precision quantization aims at balancing accuracy and inference time. To simultaneously optimize bit-width, model size, and accuracy, we propose pruning ternary quantization …
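To make the idea concrete, here is a minimal sketch of plain ternary weight quantization (in the style of Ternary Weight Networks, not the paper's specific method): weights below a magnitude threshold are pruned to zero, and the rest are snapped to a single shared magnitude ±α. The `delta_factor` heuristic and the helper name `ternarize` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Map a weight array onto {-alpha, 0, +alpha} (illustrative sketch)."""
    # Threshold below which weights are pruned to zero
    # (the 0.7 * mean(|w|) heuristic comes from Ternary Weight Networks).
    delta = delta_factor * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    # Shared scale: mean magnitude of the surviving weights.
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
wt = ternarize(w)  # small weights -> 0, large weights -> +/- alpha
```

Each surviving weight needs only 2 bits plus one shared float per tensor, which is how ternary schemes trade a little accuracy for large size and speed gains; the paper's contribution is combining this with pruning so that bit-width, size, and accuracy are optimized jointly rather than one at a time.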
