April 4, 2024, 4:45 p.m. | MLOps.community

MLOps.community www.youtube.com

// Abstract
Discover the essential steps in moving LLMs from research to production, with a focus on effective fine-tuning and alignment strategies. This session covers how to fine-tune and evaluate LLMs with Supervised Fine-Tuning (SFT) and with Reinforcement Learning from Human Feedback (RLHF) or Direct Preference Optimization (DPO), and how to apply these methods to align LLMs with production goals.

// Bio
Philipp Schmid is a Technical Lead at Hugging Face with the mission to democratize good machine learning through open source and open science. …
