Direct Preference Optimization: Forget RLHF (PPO) | allainews.com

June 6, 2023, noon | code_your_own_AI

code_your_own_AI www.youtube.com

DPO replaces RLHF: In this technical and informative video, we explore a groundbreaking methodology called direct preference optimization (DPO) by Stanford Univ that has the potential to replace reinforcement learning in the training of GPT systems.

Join us as we dive into the intricacies of direct preference optimization, dissecting its technical details and highlighting its advantages over the conventional reinforcement learning approach.

Discover how this innovative technique opens new possibilities in AI training, offering more precise control and improved performance. …

gpt highlighting join methodology optimization ppo reinforcement reinforcement learning rlhf stanford systems technical training video

More from www.youtube.com / code_your_own_AI

New Discovery: Retrieval Heads for Long Context 13 hours ago | www.youtube.com

applications attention context dev +15

Multi-Token Prediction (forget next token LLM?) 1 day, 13 hours ago | www.youtube.com

architecture autoregressive benchmark data +13

NEW LLM Test: Reasoning & gpt2-chatbot 2 days, 18 hours ago | www.youtube.com

blind causal chatbot gpt2-chatbot +8

LLMs: Rewriting Our Tomorrow (plus code) #ai 4 days, 1 hour ago | www.youtube.com

ai systems code effects future +10

Autonomous AI Agents: 14 % MAX Performance 5 days, 13 hours ago | www.youtube.com

agents ai agents autonomous autonomous agents +14

480B LLM as 128x4B MoE? WHY? 1 week ago | www.youtube.com

architecture architectures causal comparison +15

No more Fine-Tuning: Unsupervised ICL+ 1 week, 2 days ago | www.youtube.com

advanced autonomous context deepmind +17

NEW Phi-3 mini 3.8B LLM for Your PHONE: 1st TEST 1 week, 2 days ago | www.youtube.com

datasets llama llama 3 llm +9

BEST LLMs for Coding, Long Context, Overall Perform 1 week, 3 days ago | www.youtube.com

april benchmark benchmarks coding +12

AI Research Scientist

@ Vara | Berlin, Germany and Remote

View on ai-jobs.net

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Data Science Analyst

@ Mayo Clinic | AZ, United States

View on ai-jobs.net

Sr. Data Scientist (Network Engineering)

@ SpaceX | Redmond, WA

View on ai-jobs.net