April 19, 2024, 2:40 p.m. | Maxime Labonne

Towards Data Science - Medium towardsdatascience.com

A cheaper and faster unified fine-tuning technique

Image generated with DALL-E 3 by author

ORPO is a new exciting fine-tuning technique that combines the traditional supervised fine-tuning and preference alignment stages into a single process. This reduces the computational resources and time required for training. Moreover, empirical results demonstrate that ORPO outperforms other alignment methods on various model sizes and benchmarks.

In this article, we will fine-tune the new Llama 3 8B model using ORPO with the TRL library. The …

artificial intelligence editors pick hands-on-tutorials large language models machine learning

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Business Data Analyst

@ Alstom | Johannesburg, GT, ZA