all AI news
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
March 12, 2024, 4:48 a.m. | Jiun-Man Chen, Yu-Hsuan Chao, Yu-Jie Wang, Ming-Der Shieh, Chih-Chung Hsu, Wei-Fen Lin
cs.CV updates on arXiv.org arxiv.org
Abstract: Transformer-based models have gained widespread popularity in both the computer vision (CV) and natural language processing (NLP) fields. However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, \textbf{QuantTune}. Firstly, our analysis revealed that, on average, 65\% of quantization errors result from the precision loss incurred by the dynamic range amplification effect …
abstract accuracy and natural language processing arxiv challenges computer computer vision cs.cv cs.mm fields however inference language language processing linear natural natural language natural language processing nlp outlier processing quantization study training transformer type vision
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote