March 10, 2024, 6:12 p.m. | /u/Wats0ns

Machine Learning www.reddit.com

Hello,

I've been reading explanations of LoRA for hours now, and there is something I can't wrap my head around: the memory gains. I understand that a lot is saved because optimizer states are not needed for the frozen layers.

However, the LoRA paper (Section 4.2) states:

>We also observe a 25% speedup during training on GPT-3 175B compared to full fine-tuning as we do not need to calculate the gradient for the vast majority …
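
For concreteness, here is roughly how I understand the setup: a minimal PyTorch sketch (not from the paper; the class name, rank, and layer sizes are just illustrative), where only the low-rank A/B matrices are trainable and therefore only they get gradients and optimizer states.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Illustrative LoRA wrapper: frozen base weight plus a trainable low-rank update."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            self.base.weight.requires_grad_(False)      # frozen: no grads, no optimizer states
            if self.base.bias is not None:
                self.base.bias.requires_grad_(False)
            self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
            self.scaling = alpha / r

        def forward(self, x):
            # y = base(x) + (x A^T) B^T * scaling
            return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

    layer = LoRALinear(nn.Linear(4096, 4096), r=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable: {trainable:,} / total: {total:,}")  # ~65k vs ~16.8M

    # Adam keeps two extra tensors (m, v) per *trainable* parameter, so optimizer
    # state for this layer shrinks from ~2 x 16.8M floats to ~2 x 65k floats.
    opt = torch.optim.Adam((p for p in layer.parameters() if p.requires_grad), lr=1e-4)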
