Oct. 20, 2023, 8:02 p.m. | Pere Martra

Towards AI - Medium pub.towardsai.net

Let’s explore how quantization works, with an example of fine-tuning a 7-billion-parameter Bloom model on a 16 GB T4 GPU in Google Colab.
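Before diving in, here is a minimal sketch of the core idea behind quantization (this is illustrative numpy code, not the article's implementation): store weights as 8-bit integers plus a single scale factor, cutting memory per weight from 4 bytes to 1 at the cost of a small reconstruction error.

```python
import numpy as np

# Hypothetical weight values, just for illustration.
w = np.array([0.3, -1.2, 0.05, 0.9], dtype=np.float32)

# Absmax quantization: scale so the largest magnitude maps to 127.
scale = 127 / np.max(np.abs(w))            # one scale per tensor
w_q = np.round(w * scale).astype(np.int8)  # 1 byte per weight instead of 4
w_deq = w_q.astype(np.float32) / scale     # approximate reconstruction

print(w_q)    # int8 representation
print(w_deq)  # close to the original float32 weights
```

Real libraries such as bitsandbytes use more refined schemes (block-wise scales, 4-bit NormalFloat), but the trade-off is the same: far less memory in exchange for a bounded quantization error.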

This article is part of a free course about Large Language Models available on GitHub.

[Image generated by the author with DALL·E 2]

We are going to combine a weight-reduction technique, quantization, with a parameter-efficient fine-tuning technique, LoRA. The result of this combination is QLoRA, which allows us to fine-tune large …
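The LoRA half of that combination can be sketched in a few lines of numpy (an illustrative toy, not the article's code; the sizes and the `alpha` value are made-up examples): the pretrained weight stays frozen, and only two small low-rank matrices are trained.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2                      # hidden size and LoRA rank (illustrative)
W = rng.standard_normal((d, d))  # frozen pretrained weight, never updated

# LoRA trains only A and B; B starts at zero, so at initialization the
# adapted layer behaves exactly like the original one.
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))
alpha = 16                       # LoRA scaling hyperparameter

def lora_forward(x):
    # Original path plus the scaled low-rank update (alpha / r, as in LoRA).
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((1, d))
print(np.allclose(lora_forward(x), x @ W.T))  # True: B == 0 at init

# Trainable parameters: 2 * r * d for LoRA vs d * d for full fine-tuning.
print(2 * r * d, "vs", d * d)
```

At realistic sizes the savings are dramatic: with d in the thousands and r around 8–64, the trainable parameters shrink by orders of magnitude, which is what makes fine-tuning a quantized 7B model on a single T4 feasible.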

Tags: fine-tuning, LLM, large language models, QLoRA, quantization
