Feb. 20, 2024, 5:43 a.m. | Myung Gyo Oh, Hong Eun Ahn, Leo Hyun Park, Taekyoung Kwon

cs.LG updates on arXiv.org

arXiv:2402.12189v1 Announce Type: cross
Abstract: Neural language models (LMs) are vulnerable to training data extraction attacks due to data memorization. This paper introduces a novel attack scenario wherein an attacker adversarially fine-tunes pre-trained LMs to amplify the exposure of the original training data. This strategy differs from prior studies by aiming to intensify the LM's retention of its pre-training dataset. To achieve this, the attacker needs to collect generated texts that are closely aligned with the pre-training data. However, without …
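The abstract is truncated, but the amplification loop it describes — sample generations from the pre-trained LM, keep those most likely to echo pre-training data, then fine-tune on them to reinforce memorization — can be sketched concretely. Below is a minimal illustration under assumptions that are not from the paper: a HuggingFace-style causal LM (gpt2 as a stand-in), a perplexity-based filter as a proxy for "closely aligned with the pre-training data", and arbitrary sample counts and hyperparameters.

```python
# Hedged sketch of the amplification idea, NOT the paper's exact procedure.
# Assumptions: gpt2 checkpoint, low-perplexity filtering as an alignment
# proxy, and illustrative hyperparameters throughout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

def sample_texts(n=64, max_new_tokens=64):
    # Unconditional sampling; memorized sequences tend to surface among
    # the model's own high-likelihood generations.
    ids = torch.full((n, 1), tok.bos_token_id, device=device)
    out = model.generate(ids, do_sample=True, top_k=40,
                         max_new_tokens=max_new_tokens,
                         pad_token_id=tok.eos_token_id)
    return [tok.decode(o, skip_special_tokens=True) for o in out]

@torch.no_grad()
def perplexity(text):
    enc = tok(text, return_tensors="pt").to(device)
    loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

# Keep the quartile of samples most likely to echo pre-training data
# (an assumed stand-in for the paper's collection step).
samples = sample_texts()
keep = sorted(samples, key=perplexity)[: len(samples) // 4]

# Adversarially fine-tune on the kept samples to amplify their exposure.
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(3):
    for text in keep:
        enc = tok(text, return_tensors="pt", truncation=True,
                  max_length=128).to(device)
        loss = model(**enc, labels=enc["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()
```

After fine-tuning, re-sampling from the model and measuring how often verbatim pre-training sequences appear (e.g., against a reference corpus) would quantify the amplified exposure; that evaluation step is omitted here.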
