Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA & Flash Attention

Sept. 20, 2023, midnight | schmidphilipp1995@gmail.com (Philipp Schmid)

In this example we will show how to fine-tune Falcon 180B using DeepSpeed, Hugging Face Transformers, LoRA with Flash Attention on a multi-GPU machine.

attention deepspeed example face falcon falcon 180b flash generativeai gpu hugging face huggingface llm lora machine multi-gpu show transformers

Visit resource

More from www.philschmid.de / philschmid blog

Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora 4 weeks ago | www.philschmid.de

70b datasets face generativeai +11

Deploy Llama 3 on Amazon SageMaker 1 month ago | www.philschmid.de

70b amazon amazon sagemaker blog +9

Accelerate Mixtral 8x7B with Speculative Decoding and Quantziation on Amazon SageMaker 1 month, 2 weeks ago | www.philschmid.de

amazon amazon sagemaker blog decoding +9

Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum 1 month, 3 weeks ago | www.philschmid.de

70b amazon amazon sagemaker aws +16

Fine-Tune & Evaluate LLMs in 2024 with Amazon SageMaker 2 months, 1 week ago | www.philschmid.de

amazon amazon sagemaker blog face +8

Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker 2 months, 2 weeks ago | www.philschmid.de

amazon amazon sagemaker blog face +8

How to fine-tune Google Gemma with ChatML and Hugging Face TRL 2 months, 2 weeks ago | www.philschmid.de

blog datasets face gemma +10

RLHF in 2024 with DPO & Hugging Face 3 months, 3 weeks ago | www.philschmid.de

blog direct preference optimization face generativeai +9

How to Fine-Tune LLMs in 2024 with Hugging Face 3 months, 3 weeks ago | www.philschmid.de

blog dataset datasets face +11

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

View on ai-jobs.net

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

View on ai-jobs.net

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

View on ai-jobs.net

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

View on ai-jobs.net

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

View on ai-jobs.net

Lead Developer (AI)

@ Cere Network | San Francisco, US

View on ai-jobs.net

all AI news

Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA & Flash Attention

More from www.philschmid.de / philschmid blog

Jobs in AI, ML, Big Data

Software Engineer for AI Training Data (School Specific)

Software Engineer for AI Training Data (Python)

Software Engineer for AI Training Data (Tier 2)

Data Engineer

Artificial Intelligence – Bioinformatic Expert

Lead Developer (AI)