all AI news
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
April 15, 2024, 4:41 a.m. | Tanmay Gautam, Youngsuk Park, Hao Zhou, Parameswaran Raman, Wooseok Ha
cs.LG updates on arXiv.org arxiv.org
Abstract: Fine-tuning language models (LMs) has demonstrated success in a wide array of downstream tasks. However, as LMs are scaled up, the memory requirements for backpropagation become prohibitively high. Zeroth-order (ZO) optimization methods can leverage memory-efficient forward passes to estimate gradients. More recently, MeZO, an adaptation of ZO-SGD, has been shown to consistently outperform zero-shot and in-context learning when combined with suitable task prompts. In this work, we couple ZO methods with variance reduction techniques to …
abstract array arxiv backpropagation become cs.ai cs.cl cs.lg fine-tuning however language language models lms math.oc memory optimization requirements success tasks type variance
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York