FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback

April 9, 2024, 4:47 a.m. | Liqiang Jing, Xinya Du

cs.CV updates on arXiv.org

arXiv:2404.05046v1 Announce Type: new
Abstract: Large Vision-Language Models (LVLMs) have demonstrated proficiency in tackling a variety of visual-language tasks. However, current LVLMs suffer from misalignment between the text and image modalities, which causes three kinds of hallucination problems: object existence, object attribute, and object relationship. To tackle this issue, existing methods mainly use Reinforcement Learning (RL) to align the modalities in LVLMs. However, they still suffer from three main limitations: (1) General feedback cannot indicate the hallucination type contained in …
