Aligning LLM Agents by Learning Latent Preference from User Edits
April 24, 2024, 4:43 a.m. | Ge Gao, Alexey Taymanov, Eduardo Salinas, Paul Mineiro, Dipendra Misra
cs.LG updates on arXiv.org
Abstract: We study interactive learning of language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent's response to personalize it based on their latent preference, in addition to improving its correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment …
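The abstract describes an interaction loop: the agent drafts a response for a context, the user edits it according to a latent preference, and the accumulated edit feedback is used to align future generations. Below is a minimal Python sketch of that loop; the `agent`/`user` interfaces, `generate_response`, `infer_preference`, and the edit-cost proxy are illustrative assumptions, not the paper's actual method or API.

```python
import difflib

def edit_cost(draft: str, edited: str) -> float:
    """Rough proxy for user editing effort: 1 - sequence similarity (assumption)."""
    return 1.0 - difflib.SequenceMatcher(None, draft, edited).ratio()

def interactive_edit_loop(agent, user, contexts):
    """Sketch of the edit-feedback loop described in the abstract.

    Assumes `agent` exposes generate_response(context, preference) and
    infer_preference(history), and `user` exposes edit(context, draft).
    These interfaces are hypothetical placeholders.
    """
    preference = None   # latent preference inferred from past edits
    history = []        # (context, draft, edited, cost) tuples
    total_cost = 0.0
    for context in contexts:
        # Agent generates a response conditioned on the preference inferred so far.
        draft = agent.generate_response(context, preference=preference)
        # The user may edit the draft to personalize it and fix errors.
        edited = user.edit(context, draft)
        cost = edit_cost(draft, edited)
        total_cost += cost
        history.append((context, draft, edited, cost))
        # Update the latent-preference estimate from accumulated edit feedback.
        preference = agent.infer_preference(history)
    # Lower cumulative edit cost over time indicates better alignment.
    return preference, total_cost
```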
Subjects: cs.LG, cs.AI, cs.CL, cs.IR