March 1, 2024, 5:47 a.m. | Hao Cheng, Erjia Xiao, Renjing Xu

cs.CV updates on arXiv.org

arXiv:2402.19150v1 Announce Type: new
Abstract: Large Multimodal Models (LMMs) rely on pre-trained Vision Language Models (VLMs) and Large Language Models (LLMs) to achieve remarkable emergent abilities on various multimodal tasks in the joint space of vision and language. However, the Typographic Attack, which has been shown to disrupt VLMs, has also been identified as a security vulnerability for LMMs. In this work, we first comprehensively investigate the distractibility of LMMs by typography. In particular, we introduce the Typographic Dataset designed to evaluate …
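
To make the threat model concrete, below is a minimal sketch of a typographic attack against a CLIP-style VLM, assuming the open-source `open_clip` and `PIL` libraries; the checkpoint name, image path, and overlay position are illustrative assumptions, and the paper's own Typographic Dataset and LMM evaluation pipeline are not reproduced here.

```python
# Sketch: overlay a misleading word on an image and compare the VLM's
# zero-shot predictions before and after the typographic perturbation.
import torch
from PIL import Image, ImageDraw
import open_clip

# Load a pre-trained CLIP model (checkpoint choice is an assumption).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def add_typography(image: Image.Image, text: str) -> Image.Image:
    """Draw a misleading word onto the image (the typographic perturbation)."""
    attacked = image.copy()
    draw = ImageDraw.Draw(attacked)
    draw.text((10, 10), text, fill="white")  # default font; position is arbitrary
    return attacked

labels = ["a photo of a dog", "a photo of a cat"]
image = Image.open("dog.jpg").convert("RGB")   # hypothetical input image of a dog
attacked = add_typography(image, "cat")        # overlay the competing class name

with torch.no_grad():
    text_feat = model.encode_text(tokenizer(labels))
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    for name, img in [("clean", image), ("typographic", attacked)]:
        img_feat = model.encode_image(preprocess(img).unsqueeze(0))
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        probs = (100.0 * img_feat @ text_feat.T).softmax(dim=-1)
        print(name, probs.tolist())
```

In practice, the overlaid word often shifts probability mass toward the written class rather than the visual content, which is the distractibility the paper's dataset is designed to measure systematically for LMMs.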

