Web: http://arxiv.org/abs/2209.07046

Sept. 16, 2022, 1:15 a.m. | Yi Li, Hualiang Wang, Yiqun Duan, Hang Xu, Xiaomeng Li

cs.CV updates on arXiv.org

Contrastive Language-Image Pre-training (CLIP) learns rich representations
from readily available natural-language supervision. It can improve
performance on downstream vision tasks, including but not limited to
zero-shot classification, long-tailed recognition, segmentation, retrieval,
captioning, and video tasks. However, to the best of our knowledge, the
visual interpretability of CLIP has not yet been studied. To provide visual
explanations of its predictions, we propose the Image-Text Similarity Map
(ITSM). Based on it, we surprisingly find that CLIP prefers background
regions over …
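The core idea behind an image-text similarity map can be sketched as follows: compare each spatial (patch-level) image feature against the text embedding with cosine similarity, then reshape the scores back onto the patch grid to get a heatmap. This is a minimal illustrative sketch with random toy features, not the paper's exact formulation; the function name `image_text_similarity_map` and the toy shapes are assumptions for illustration.

```python
import numpy as np

def image_text_similarity_map(patch_feats, text_feat, grid_hw):
    """Cosine similarity between each patch feature and the text feature,
    reshaped into a spatial map of shape grid_hw."""
    # L2-normalize patch features (rows) and the text feature
    p = patch_feats / np.linalg.norm(patch_feats, axis=-1, keepdims=True)
    t = text_feat / np.linalg.norm(text_feat)
    sim = p @ t                      # (num_patches,) cosine similarities
    return sim.reshape(grid_hw)      # e.g. (7, 7) heatmap over the image

# Toy example: a 7x7 grid of 49 patches with 512-dim features
rng = np.random.default_rng(0)
patches = rng.normal(size=(49, 512))   # stand-in for CLIP patch tokens
text = rng.normal(size=(512,))         # stand-in for the text embedding
itsm = image_text_similarity_map(patches, text, (7, 7))
```

High values in the resulting map mark patches most aligned with the text query; visualizing such maps is what reveals which regions (foreground vs. background) the model actually attends to.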
