Dec. 1, 2023, 4:31 p.m. | /u/danielhanchen

Hey [r/MachineLearning](https://www.reddit.com/r/MachineLearning/)!

I manually derived the backpropagation steps, optimized the chained matrix multiplications, wrote all kernels in OpenAI's Triton language, and used some more maths and coding trickery to make QLoRA finetuning for Llama 5x faster with Unsloth: [https://github.com/unslothai/unsloth](https://github.com/unslothai/unsloth)! Some highlights (with a couple of illustrative sketches after the list):

* **5x faster** (e.g. 5 hours of training down to 1 hour)
* Uses **50% less memory**
* With **0% loss in accuracy**
* Runs entirely **locally** on NVIDIA GPUs (Tesla T4, RTX 20/30/40 series, Ampere, Hopper) for **free**!
* QLoRA / LoRA is now 80% …
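
To make the "chained matrix multiplication" point concrete: in LoRA the weight update is a low-rank product `B @ A`, so the bracketing order of the matmuls changes the FLOP count by orders of magnitude. Here is a toy PyTorch sketch of that idea; the shapes are made up for illustration, and this is not Unsloth's actual code:

```python
# Illustration of the chained-matmul bracketing trick for LoRA, where the
# weight delta is B @ A with rank r << d, k. Shapes are hypothetical.
import torch

n, k, d, r = 2048, 4096, 4096, 16   # batch*seq, in-dim, out-dim, LoRA rank
X = torch.randn(n, k)
A = torch.randn(r, k)               # LoRA "A" projection
B = torch.randn(d, r)               # LoRA "B" projection

# Bad bracketing: materialise the full d x k weight update first.
# Cost: ~2*d*r*k + 2*n*k*d FLOPs (~69 GFLOP for these shapes).
slow = X @ (B @ A).T

# Good bracketing: stay inside the rank-r bottleneck the whole way.
# Cost: ~2*n*k*r + 2*n*r*d FLOPs (~0.5 GFLOP) -- two orders of magnitude less.
fast = (X @ A.T) @ B.T

assert torch.allclose(slow, fast, rtol=1e-3, atol=1e-2)
```

The same bookkeeping applies to the hand-derived backward pass: picking the cheap bracketing for each gradient term is where a lot of the speedup comes from.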
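And for a flavour of what "wrote all kernels in Triton" looks like, here is a minimal generic elementwise kernel; it is a hypothetical example of the style, not one of Unsloth's fused kernels:

```python
# A minimal Triton elementwise kernel: out = x + alpha * y.
# Generic example only -- not Unsloth code.
import torch
import triton
import triton.language as tl

@triton.jit
def scaled_add_kernel(x_ptr, y_ptr, out_ptr, alpha, n_elements,
                      BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + alpha * y, mask=mask)

def scaled_add(x, y, alpha=1.0):
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)       # one program per 1024 elements
    scaled_add_kernel[grid](x, y, out, alpha, n, BLOCK_SIZE=1024)
    return out
```

Writing the LoRA/QLoRA forward and backward as fused kernels like this (instead of chains of separate PyTorch ops) is what cuts the memory traffic and gives the speedup without touching accuracy.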
