Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game. (arXiv:2311.01011v1 [cs.LG]) | allainews.com

Nov. 5, 2023, 6:42 a.m. | Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrel

cs.LG updates on arXiv.org arxiv.org

While Large Language Models (LLMs) are increasingly being used in real-world
applications, they remain vulnerable to prompt injection attacks: malicious
third party prompts that subvert the intent of the system designer. To help
researchers study this problem, we present a dataset of over 126,000 prompt
injection attacks and 46,000 prompt-based "defenses" against prompt injection,
all created by players of an online game called Tensor Trust. To the best of
our knowledge, this is currently the largest dataset of human-generated
adversarial …

applications arxiv attacks dataset designer game language language models large language large language models llms prompt prompt injection prompt injection attacks prompts researchers study tensor trust vulnerable world

More from arxiv.org / cs.LG updates on arXiv.org

Course Recommender Systems Need to Consider the Job Market 21 hours ago | arxiv.org

abstract arxiv course cs.ir +16

$\texttt{immrax}$: A Parallelizable and Differentiable Toolbox for Interval Analysis and Mixed Monotone Reachability in JAX 21 hours ago | arxiv.org

abstract analysis arxiv compilation +18

Thousands of AI Authors on the Future of AI 21 hours ago | arxiv.org

abstract advanced advanced ai ai progress +21

Graphene: Infrastructure Security Posture Analysis with AI-generated Attack Graphs 21 hours ago | arxiv.org

abstract analysis arxiv assessment +24

Volume-Preserving Transformers for Learning Time Series Data with Structure 21 hours ago | arxiv.org

abstract arxiv cs.lg cs.na +24

Eureka: Human-Level Reward Design via Coding Large Language Models 21 hours ago | arxiv.org

abstract algorithm arxiv bridge +25

Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks 21 hours ago | arxiv.org

abstract arxiv attention cs.lg +11

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search 21 hours ago | arxiv.org

abstract arxiv become compression +24

Gaussian random field approximation via Stein's method with applications to wide random neural networks 21 hours ago | arxiv.org

abstract applications approximation arxiv +14

AI Research Scientist

@ Vara | Berlin, Germany and Remote

View on ai-jobs.net

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

Data Analyst (Digital Business Analyst)

@ Activate Interactive Pte Ltd | Singapore, Central Singapore, Singapore

View on ai-jobs.net