RLHF in 2024 with DPO & Hugging Face

Jan. 23, 2024 | Philipp Schmid (schmidphilipp1995@gmail.com)

In this blog post you will learn how to align LLMs using Hugging Face TRL and RLHF through Direct Preference Optimization (DPO).

More from www.philschmid.de / philschmid blog

Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora 3 weeks, 2 days ago | www.philschmid.de

Deploy Llama 3 on Amazon SageMaker 3 weeks, 6 days ago | www.philschmid.de

Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker 1 month, 1 week ago | www.philschmid.de

Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum 1 month, 2 weeks ago | www.philschmid.de

Fine-Tune & Evaluate LLMs in 2024 with Amazon SageMaker 2 months ago | www.philschmid.de

Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker 2 months, 1 week ago | www.philschmid.de

How to fine-tune Google Gemma with ChatML and Hugging Face TRL 2 months, 2 weeks ago | www.philschmid.de

How to Fine-Tune LLMs in 2024 with Hugging Face 3 months, 3 weeks ago | www.philschmid.de
