Feb. 29, 2024, 5:48 a.m. | Yue Deng, Wenxuan Zhang, Sinno Jialin Pan, Lidong Bing

cs.CL updates on arXiv.org

arXiv:2310.06474v2 Announce Type: replace
Abstract: While large language models (LLMs) exhibit remarkable capabilities across a wide range of tasks, they pose potential safety concerns, such as the "jailbreak" problem, wherein malicious instructions can manipulate LLMs to exhibit undesirable behavior. Although several preventive measures have been developed to mitigate the potential risks associated with LLMs, they have primarily focused on English. In this study, we reveal the presence of multilingual jailbreak challenges within LLMs and consider two potential risky scenarios: unintentional …

