Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective
Feb. 23, 2024, 5:46 a.m. | Zihao Yue, Liang Zhang, Qin Jin
cs.CV updates on arXiv.org
Abstract: Large Multimodal Models (LMMs) often suffer from multimodal hallucinations, wherein they may create content that is not present in the visual inputs. In this paper, we explore a new angle of this issue: overly detailed training data hinders the model's ability to timely terminate generation, leading to continued outputs beyond visual perception limits. By investigating how the model decides to terminate generation with EOS, the special end-of-sentence token, we find that the model assesses the …
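The EOS decision the abstract describes can be illustrated with a toy sketch: at each autoregressive decoding step the model produces logits over the vocabulary, and generation stops once the probability assigned to the end-of-sentence token is high enough. The function below is a minimal, self-contained stand-in (the logit vectors, `eos_id`, and threshold are hypothetical, not from the paper), not the authors' actual method:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate_with_eos_threshold(step_logits, eos_id, threshold=0.5):
    """Greedy decoding with an explicit EOS decision at every step.

    step_logits: one logit vector per decoding step -- a toy stand-in
    for a real LMM's per-step outputs. Generation terminates as soon
    as P(EOS) reaches the threshold; otherwise the argmax token is kept.
    """
    tokens = []
    for logits in step_logits:
        probs = softmax(logits)
        if probs[eos_id] >= threshold:
            break  # the model "decides" it has said enough
        tokens.append(max(range(len(probs)), key=probs.__getitem__))
    return tokens

# Toy vocabulary of 3 tokens, where token id 2 is EOS. P(EOS) stays
# low for two steps, then dominates at step three and halts decoding.
steps = [[2.0, 0.1, -1.0], [0.5, 2.0, 0.0], [0.0, 0.0, 3.0]]
print(generate_with_eos_threshold(steps, eos_id=2))  # → [0, 1]
```

In this framing, the paper's observation is that overly detailed training captions bias the model toward keeping P(EOS) low for too long, so decoding continues past what the visual input actually supports.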