March 4, 2024, 5:43 a.m. | Jinghuai Zhang, Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

cs.LG updates on arXiv.org

arXiv:2211.08229v5 Announce Type: replace-cross
Abstract: Contrastive learning (CL) pre-trains general-purpose encoders using an unlabeled pre-training dataset, which consists of images or image-text pairs. CL is vulnerable to data-poisoning-based backdoor attacks (DPBAs), in which an attacker injects poisoned inputs into the pre-training dataset so that the encoder is backdoored. However, existing DPBAs achieve limited effectiveness. In this work, we take the first step toward analyzing the limitations of existing backdoor attacks and propose new DPBAs against CL, called CorruptEncoder. CorruptEncoder …
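The threat model above can be illustrated with a minimal sketch. This is a generic data-poisoning step, not CorruptEncoder's actual method: the trigger pattern, patch position, and poison rate are all hypothetical, and images are assumed to be HxWxC uint8 NumPy arrays. The key point is that the attacker only appends unlabeled inputs to the pre-training set.

```python
# Illustrative sketch of a generic DPBA poisoning step (hypothetical
# trigger and poison rate; NOT the paper's CorruptEncoder method).
import numpy as np

def add_trigger(image, trigger, x=0, y=0):
    """Stamp a small trigger patch onto a copy of the image at (x, y)."""
    patched = image.copy()
    th, tw = trigger.shape[:2]
    patched[y:y + th, x:x + tw] = trigger
    return patched

def poison_dataset(images, trigger, poison_rate=0.01, seed=0):
    """Return the pre-training set with trigger-patched copies injected.

    Since CL pre-training is unlabeled, the attacker needs no labels:
    poisoned inputs are simply added to the dataset.
    """
    rng = np.random.default_rng(seed)
    n_poison = max(1, int(len(images) * poison_rate))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    poisoned = [add_trigger(images[i], trigger) for i in idx]
    return images + poisoned

# Example: 100 blank 32x32 RGB "images" and a 4x4 white trigger patch.
images = [np.zeros((32, 32, 3), dtype=np.uint8) for _ in range(100)]
trigger = np.full((4, 4, 3), 255, dtype=np.uint8)
poisoned_set = poison_dataset(images, trigger, poison_rate=0.05)
print(len(poisoned_set))  # 105
```

An encoder pre-trained on such a set can learn to associate the trigger with attacker-chosen features; the paper's contribution lies in how the poisoned inputs are constructed to make this association strong.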

