all AI news
Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
March 28, 2024, 4:42 a.m. | Reza Abbasi, Mohammad Samiei, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah
cs.LG updates on arXiv.org arxiv.org
Abstract: Vision-language models, such as CLIP, have shown promising Out-of-Distribution (OoD) generalization under various types of distribution shifts. Recent studies attempted to investigate the leading cause of this capability. In this work, we follow the same path, but focus on a specific type of OoD data - images with novel compositions of attribute-object pairs - and study whether such models can successfully classify those images into composition classes. We carefully designed an authentic image test dataset …
abstract arxiv capability clip cs.cl cs.cv cs.lg distribution focus language language models object path pivotal role studies type types vision vision-language models work
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
AIML - Sr Machine Learning Engineer, Data and ML Innovation
@ Apple | Seattle, WA, United States
Senior Data Engineer
@ Palta | Palta Cyprus, Palta Warsaw, Palta remote