When and why vision-language models behave like bags-of-words, and what to do about it? (arXiv:2210.01936v2 [cs.CV] UPDATED)
Oct. 7, 2022, 1:17 a.m. | Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
cs.CL updates on arXiv.org
Despite the success of large vision-and-language models (VLMs) in many
downstream applications, it is unclear how well they encode compositional
information. Here, we create the Attribution, Relation, and Order (ARO)
benchmark to systematically evaluate the ability of VLMs to understand
different types of relationships, attributes, and order. ARO consists of Visual
Genome Attribution, to test the understanding of objects' properties; Visual
Genome Relation, to test for relational understanding; and COCO &
Flickr30k-Order, to test for order sensitivity. ARO …
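The order-sensitivity test hinges on comparing how a VLM scores a correct caption against a word-shuffled variant of the same caption. The sketch below, a hypothetical illustration of that idea rather than the benchmark's actual code, shows how such a perturbed caption might be generated; the function name `shuffle_words` and the seeding scheme are assumptions, not part of ARO.

```python
import random

def shuffle_words(caption: str, seed: int = 0) -> str:
    """Return a word-shuffled variant of a caption, a stand-in for the
    kind of order perturbation used in COCO/Flickr30k-Order."""
    words = caption.split()
    rng = random.Random(seed)
    shuffled = words[:]
    # Reshuffle until the order actually changes (possible whenever
    # the caption has more than one word).
    while shuffled == words and len(words) > 1:
        rng.shuffle(shuffled)
    return " ".join(shuffled)

caption = "the horse is eating the grass"
perturbed = shuffle_words(caption)
# An order-sensitive VLM should score `caption` higher than `perturbed`
# against a matching image; a bag-of-words model, which ignores word
# order, would assign the two captions (near-)identical scores.
```

Because the perturbation preserves the multiset of words, any score gap between the two captions can only come from sensitivity to word order, which is exactly what the benchmark isolates.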