DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
March 20, 2024, 4:43 a.m. | Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran
cs.LG updates on arXiv.org arxiv.org
Abstract: We present DRESS, a large vision-language model (LVLM) that innovatively exploits natural language feedback (NLF) from large language models to enhance its alignment and interactions, addressing two key limitations of state-of-the-art LVLMs. First, prior LVLMs generally rely only on the instruction fine-tuning stage to enhance alignment with human preferences; without incorporating extra feedback, they remain prone to generating unhelpful, hallucinated, or harmful responses. Second, while the visual instruction tuning data is …