The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions | allainews.com

s

April 23, 2024, 3:36 a.m. |

Simon Willison's Weblog simonwillison.net

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

By far the most detailed paper on prompt injection I've seen yet from OpenAI, published a few days ago and with six credited authors: Eric Wallace, Kai Xiao, Reimar Leike, Lilian Weng, Johannes Heidecke and Alex Beutel.

The paper notes that prompt injection mitigations which completely refuse any form of instruction in an untrusted prompt may not actually be ideal: some forms of instruction are harmless, and refusing them may provide …

ai alex authors eric generativeai kai lilian weng llms notes openai paper prompt prompt injection promptinjection security six training training llms

More from simonwillison.net / Simon Willison's Weblog

Si

Bullying in Open Source Software Is a Massive Security Vulnerability 10 hours ago | simonwillison.net

backdoor contributor linux linux distributions +13

Si

experimental-phi3-webgpu 11 hours ago | simonwillison.net

ai browser browsers cache +20

Si

datasette-pins — a new Datasette plugin for pinning tables and queries 14 hours ago | simonwillison.net

alex alexgarcia cloud databases +11

Si

Quoting Nathaniel Borenstein 1 day, 13 hours ago | simonwillison.net

basic consent engineer ethics +5

Si

Slop is the new name for unwanted AI-generated content 1 day, 14 hours ago | simonwillison.net

ai ai generated ai-generated content art +11

Si

OpenAI Model Spec, May 2024 edition 1 day, 15 hours ago | simonwillison.net

ai api chatgpt core +10

Si

Modern SQLite: Generated columns 1 day, 16 hours ago | simonwillison.net

antonzhiyanov features generated modern +6

Si

Tagged Pointer Strings (2015) 1 day, 19 hours ago | simonwillison.net

embed implementation least macos +6

Si

Towards universal version control with Patchwork 2 days, 7 hours ago | simonwillison.net

ai applications beyond control +12

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

View on ai-jobs.net

Lead Developer (AI)

@ Cere Network | San Francisco, US

View on ai-jobs.net

Research Engineer

@ Allora Labs | Remote

View on ai-jobs.net

Ecosystem Manager

@ Allora Labs | Remote

View on ai-jobs.net

Founding AI Engineer, Agents

@ Occam AI | New York

View on ai-jobs.net

AI Engineer Intern, Agents

@ Occam AI | US

View on ai-jobs.net