March 20, 2024, 4:45 a.m. | Anjun Hu, Jindong Gu, Francesco Pinto, Konstantinos Kamnitsas, Philip Torr

cs.CV updates on arXiv.org arxiv.org

arXiv:2403.12693v1 Announce Type: new
Abstract: Foundation models pre-trained on web-scale vision-language data, such as CLIP, are widely used as cornerstones of powerful machine learning systems. While pre-training offers clear advantages for downstream learning, it also endows downstream models with shared adversarial vulnerabilities that can be easily identified through the open-sourced foundation model. In this work, we expose such vulnerabilities in CLIP's downstream models and show that foundation models can serve as a basis for attacking their downstream systems. In particular, …

abstract advantages adversarial adversarial examples arxiv clear clip cs.cv data examples foundation language language data learning systems machine machine learning pre-training scale systems tasks training type vision vulnerabilities web

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Research Scientist

@ Meta | Menlo Park, CA

Principal Data Scientist

@ Mastercard | O'Fallon, Missouri (Main Campus)