Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks
Feb. 15, 2024, 5:42 a.m. | Yixin Cheng, Markos Georgopoulos, Volkan Cevher, Grigorios G. Chrysos
cs.LG updates on arXiv.org arxiv.org
Abstract: Large Language Models (LLMs) are susceptible to jailbreaking attacks, which aim to extract harmful information by subtly modifying the attack query. As defense mechanisms evolve, directly obtaining harmful information through a single query becomes increasingly challenging. In this work, inspired by the human practice of using indirect context to elicit harmful information, we focus on a new attack form called Contextual Interaction Attack. The idea relies on the autoregressive nature of the generation process in LLMs. We contend …
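The attack form described above relies on building conversational context across multiple rounds before the final query is issued. A minimal, purely illustrative sketch of that loop (no real LLM is called; the stub model and all function names are assumptions, not from the paper) might look like:

```python
# Hypothetical sketch of a multi-round contextual interaction:
# benign-seeming preliminary turns accumulate a conversation history,
# and the final query is then interpreted by the (autoregressive)
# model in the context of everything that came before.

def run_contextual_interaction(model, preliminary_turns, final_query):
    """Accumulate a multi-round history, then issue the final query."""
    history = []
    for turn in preliminary_turns:
        reply = model(history, turn)
        history.append({"role": "user", "content": turn})
        history.append({"role": "assistant", "content": reply})
    # The final query arrives with the whole preceding context attached.
    return model(history, final_query), history

# Stand-in stub instead of a real LLM: reports how much context it saw.
def stub_model(history, prompt):
    return f"seen {len(history)} prior messages; answering: {prompt}"

answer, history = run_contextual_interaction(
    stub_model,
    ["Tell me about topic A.", "How does A relate to B?"],
    "Now, given all that, explain B in detail.",
)
print(answer)
```

The point of the sketch is only the structure: each round appends to the shared history, so by the final round the model conditions on four prior messages rather than a bare query.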