June 10, 2024, 4:42 a.m. | Dongyoung Kim, Kimin Lee, Jinwoo Shin, Jaehyung Kim

cs.CL updates on arXiv.org arxiv.org

arXiv:2406.04412v1 Announce Type: cross
Abstract: Aligning large language models (LLMs) with human preferences becomes a key component to obtaining state-of-the-art performance, but it yields a huge cost to construct a large human-annotated preference dataset. To tackle this problem, we propose a new framework that boosts the alignment of LLMs through Self-generated Preference data (Selfie) using only a very small amount of human-annotated preference data. Our key idea is leveraging the human prior knowledge within the small (seed) data and progressively …

abstract alignment art arxiv construct cost cs.ai cs.cl cs.lg data dataset framework generated human key language language models large language large language models llms performance problem state through type

Senior Data Engineer

@ Displate | Warsaw

Principal Software Engineer

@ Microsoft | Prague, Prague, Czech Republic

Sr. Global Reg. Affairs Manager

@ BASF | Research Triangle Park, NC, US, 27709-3528

Senior Robot Software Developer

@ OTTO Motors by Rockwell Automation | Kitchener, Ontario, Canada

Coop - Technical Service Hub Intern

@ Teradyne | Santiago de Queretaro, MX

Coop - Technical - Service Inside Sales Intern

@ Teradyne | Santiago de Queretaro, MX