March 12, 2024, 4:42 a.m. | Zhuocheng Gong, Jiahao Liu, Jingang Wang, Xunliang Cai, Dongyan Zhao, Rui Yan

cs.LG updates on arXiv.org

arXiv:2403.06408v1 Announce Type: new
Abstract: Quantization has emerged as a promising technique for improving the memory and computational efficiency of large language models (LLMs). Though the trade-off between performance and efficiency is well-known, there is still much to be learned about the relationship between quantization and LLM performance. To shed light on this relationship, we propose a new perspective on quantization, viewing it as perturbations added to the weights and activations of LLMs. We call this approach "the lens of …
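To make the "quantization as perturbation" framing concrete, here is a minimal sketch. It assumes uniform symmetric round-to-nearest quantization as a baseline (the paper's actual quantization schemes may differ), and shows that a quantized weight tensor is exactly the original tensor plus an additive perturbation dW = Q(W) - W:

```python
import torch

def quantize_weights(w: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric round-to-nearest quantization.

    A common baseline scheme, used here for illustration only; not
    necessarily the quantizer studied in the paper.
    """
    scale = w.abs().max() / (2 ** (n_bits - 1) - 1)
    return torch.round(w / scale) * scale

# View quantization through the lens of perturbation: W_q = W + dW,
# where dW = Q(W) - W is the quantization error.
w = torch.randn(4096, 4096)
w_q = quantize_weights(w, n_bits=4)
perturbation = w_q - w

# The quantized model is the original model with dW added to its weights,
# so its behavior can be analyzed as a response to weight perturbations.
print(f"relative perturbation norm: {perturbation.norm() / w.norm():.4f}")
```

The same decomposition applies to activations: quantizing an activation tensor at inference time adds a data-dependent perturbation, which is what makes studying LLM robustness to such perturbations informative about quantization behavior.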

