Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue
Feb. 28, 2024, 5:49 a.m. | Zhenhong Zhou, Jiuyang Xiang, Haopeng Chen, Quan Liu, Zherui Li, Sen Su
cs.CL updates on arXiv.org
Abstract: Large Language Models (LLMs) have been demonstrated to generate illegal or unethical responses, particularly when subjected to "jailbreak" attacks. Research on jailbreaking has highlighted the safety issues of LLMs. However, prior studies have predominantly focused on single-turn dialogue, ignoring the potential complexities and risks presented by multi-turn dialogue, a crucial mode through which humans obtain information from LLMs. In this paper, we argue that humans can exploit multi-turn dialogue to induce LLMs into generating harmful information. …
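To make the multi-turn setting concrete, the sketch below shows how a multi-turn dialogue is typically represented in chat-style LLM APIs: an ordered list of role-tagged messages, where each user turn builds on the conversation history. The helper name, placeholder replies, and the (deliberately benign) sub-queries are illustrative assumptions, not the attack strategy proposed in the paper.

```python
# Minimal sketch of the message-history format used by most chat LLM APIs.
# The sub-queries here are hypothetical and benign; the paper's point is that
# a request split across turns is evaluated against the whole history, not
# each turn in isolation.

def build_dialogue(sub_queries):
    """Assemble a multi-turn message history from a sequence of user turns.

    Each user turn is followed by a placeholder assistant reply; in a real
    exchange the assistant content would come from the model.
    """
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for q in sub_queries:
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": "<model reply>"})
    return messages

# Three innocuous-looking turns that together form one larger query.
history = build_dialogue([
    "What chemicals are common in household cleaners?",
    "Which of those are regulated, and why?",
    "Summarize the safety guidance for storing them.",
])
print(len(history))  # 1 system message + 3 user/assistant pairs
```

Single-turn safety filters that inspect only the latest user message would see each turn above in isolation, which is why the multi-turn mode deserves separate study.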