Feb. 16, 2024, 5:42 a.m. | Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen

cs.LG updates on arXiv.org

arXiv:2402.10208v1 Announce Type: new
Abstract: The dominant paradigm in generative modeling consists of two steps: i) pre-training on a large-scale but unsafe dataset, ii) aligning the pre-trained model with human values via fine-tuning. This practice is considered safe, as no current method can recover the unsafe, pre-fine-tuning model weights. In this paper, we demonstrate that this assumption is often false. Concretely, we present Spectral DeTuning, a method that can recover the weights of the pre-fine-tuning model using a few low-rank …

