March 5, 2024, 2:50 p.m. | Chaoquan Jiang, Jinqiang Wang, Rui Hu, Jitao Sang

cs.CV updates on arXiv.org arxiv.org

arXiv:2312.05588v2 Announce Type: replace-cross
Abstract: Vision models with high overall accuracy often exhibit systematic errors in specific scenarios, posing potentially serious safety concerns. Diagnosing bugs in vision models is gaining increased attention; however, traditional diagnostic approaches require annotation effort (e.g., rich metadata accompanying each sample of CelebA). To address this issue, we propose a language-assisted diagnostic method that uses texts instead of images to diagnose bugs in vision models based on multi-modal models (e.g., CLIP). Our approach connects the embedding space …

