What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation
March 12, 2024, 4:42 a.m. | Zhuocheng Gong, Jiahao Liu, Jingang Wang, Xunliang Cai, Dongyan Zhao, Rui Yan
cs.LG updates on arXiv.org (arxiv.org)
Abstract: Quantization has emerged as a promising technique for improving the memory and computational efficiency of large language models (LLMs). Though the trade-off between performance and efficiency is well-known, there is still much to be learned about the relationship between quantization and LLM performance. To shed light on this relationship, we propose a new perspective on quantization, viewing it as perturbations added to the weights and activations of LLMs. We call this approach "the lens of …
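As a rough illustration of the perturbation view described in the abstract (a minimal sketch, not the paper's exact scheme): under symmetric round-to-nearest uniform quantization, the quantized weights can be written as W_q = W + Δ, where Δ is the quantization error acting as an additive perturbation. The function name, bit-width, and tensor shape below are illustrative assumptions.

```python
import torch

def quantize_uniform(w: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    # Symmetric round-to-nearest uniform quantization (illustrative only).
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

w = torch.randn(1024, 1024)            # stand-in for an LLM weight matrix
w_q = quantize_uniform(w, num_bits=4)
delta = w_q - w                        # quantization as an additive perturbation: W_q = W + delta
print(f"relative perturbation magnitude: {(delta.norm() / w.norm()).item():.4f}")
```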