Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

April 2, 2024, 7:48 p.m. | Ruohong Zhang, Liangke Gui, Zhiqing Sun, Yihao Feng, Keyang Xu, Yuanhan Zhang, Di Fu, Chunyuan Li, Alexander Hauptmann, Yonatan Bisk, Yiming Yang

cs.CV updates on arXiv.org

arXiv:2404.01258v1 Announce Type: new
Abstract: Preference modeling techniques, such as direct preference optimization (DPO), have been shown to be effective in enhancing the generalization abilities of large language models (LLMs). However, in tasks involving video instruction-following, providing informative feedback, especially for detecting hallucinations in generated responses, remains a significant challenge. Previous studies have explored using large multimodal models (LMMs) as reward models to guide preference modeling, but their ability to accurately assess the factuality of generated responses compared to corresponding videos has …
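
For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. It illustrates the general objective only, not the specific training pipeline or reward setup of this paper, which the truncated abstract does not describe. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed log-probabilities of each full response under the
    trainable policy and under a frozen reference model. All names here
    are hypothetical and chosen for illustration.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; positive when the policy prefers "chosen".
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, loss is log(2).
    print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
    # Policy has shifted probability toward the chosen response: lower loss.
    print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

The loss shrinks as the policy raises the likelihood of the preferred response relative to the reference model and lowers that of the rejected one; `beta` controls how strongly the policy is held close to the reference.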


