Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game. (arXiv:2311.01011v1 [cs.LG])
cs.LG updates on arXiv.org
While Large Language Models (LLMs) are increasingly being used in real-world
applications, they remain vulnerable to prompt injection attacks: malicious
third-party prompts that subvert the intent of the system designer. To help
researchers study this problem, we present a dataset of over 126,000 prompt
injection attacks and 46,000 prompt-based "defenses" against prompt injection,
all created by players of an online game called Tensor Trust. To the best of
our knowledge, this is currently the largest dataset of human-generated
adversarial …