Feb. 9, 2024, 5:46 a.m. | Kathleen C. Fraser, Svetlana Kiritchenko

cs.CV updates on arXiv.org

Following recent advances in large language models (LLMs) and the chat models built on them, a new wave of large vision-language models (LVLMs) has emerged. Such models accept images as input in addition to text and can perform tasks such as visual question answering, image captioning, and story generation. Here, we examine potential gender and racial biases in such systems, based on the perceived characteristics of the people in the input images. To accomplish this, we present a new dataset, PAIRS (PArallel …


