Visual Commonsense in Pretrained Unimodal and Multimodal Models. (arXiv:2205.01850v1 [cs.CL])
Web: http://arxiv.org/abs/2205.01850
May 5, 2022, 1:10 a.m. | Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, Elias Stengel-Eskin
cs.CV updates on arXiv.org
Our commonsense knowledge about objects includes their typical visual
attributes; we know that bananas are typically yellow or green, and not purple.
Text and image corpora, being subject to reporting bias, represent this
world-knowledge to varying degrees of faithfulness. In this paper, we
investigate to what degree unimodal (language-only) and multimodal (image and
language) models capture a broad range of visually salient attributes. To that
end, we create the Visual Commonsense Tests (ViComTe) dataset covering 5
property types (color, shape, …
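The abstract points to attribute-level probing of pretrained models. As a rough illustration only (not the authors' released code), the sketch below queries a generic masked language model with a hand-written color template; the model name, template wording, and top-k value are all assumptions:

```python
# Minimal sketch, assuming a HuggingFace masked LM; not the ViComTe pipeline.
from transformers import pipeline

# Probe a unimodal (text-only) model for a color attribute.
fill = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical template in the spirit of the "color" property type.
for pred in fill("Most bananas are [MASK].", top_k=5):
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")

# A model with good visual commonsense should rank plausible colors such as
# "yellow" and "green" above implausible ones such as "purple".
```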