March 1, 2024, 5:47 a.m. | Sibo Wang, Jie Zhang, Zheng Yuan, Shiguang Shan

cs.CV updates on arXiv.org

arXiv:2401.04350v2 Announce Type: replace
Abstract: Large-scale pre-trained vision-language models like CLIP have demonstrated impressive performance across various tasks, and exhibit remarkable zero-shot generalization capability, while they are also vulnerable to imperceptible adversarial examples. Existing works typically employ adversarial training (fine-tuning) as a defense method against adversarial examples. However, direct application to the CLIP model may result in overfitting, compromising the model's capacity for generalization. In this paper, we propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) method, which leverages supervision from …

