s
Jan. 6, 2024, 4:08 a.m. |

Simon Willison's Weblog simonwillison.net

Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations


NIST - the National Institute of Standards and Technology, a US government agency, released a 106 page report on attacks against modern machine learning models, mostly covering LLMs.

Prompt injection gets two whole sections, one on direct prompt injection (which incorporates jailbreaking as well, which they misclassify as a subset of prompt injection) and one on indirect prompt injection.

They talk a little bit about mitigations, but for both …

adversarial adversarial machine learning agency ai attacks generativeai government institute llms machine machine learning machine learning models modern nist page prompt prompt injection promptinjection report standards taxonomy technology terminology

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior Data Scientist

@ ITE Management | New York City, United States