June 10, 2024, 4:42 a.m. | Dongyoung Kim, Kimin Lee, Jinwoo Shin, Jaehyung Kim

arXiv:2406.04412v1 Announce Type: cross
Abstract: Aligning large language models (LLMs) with human preferences has become a key component in obtaining state-of-the-art performance, but constructing a large human-annotated preference dataset is very costly. To tackle this problem, we propose a new framework that boosts the alignment of LLMs through Self-generated Preference data (Selfie) using only a very small amount of human-annotated preference data. Our key idea is leveraging the human prior knowledge within the small (seed) data and progressively …

