The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions | allainews.com

s

April 23, 2024, 3:36 a.m. |

Simon Willison's Weblog simonwillison.net

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

By far the most detailed paper on prompt injection I've seen yet from OpenAI, published a few days ago and with six credited authors: Eric Wallace, Kai Xiao, Reimar Leike, Lilian Weng, Johannes Heidecke and Alex Beutel.

The paper notes that prompt injection mitigations which completely refuse any form of instruction in an untrusted prompt may not actually be ideal: some forms of instruction are harmless, and refusing them may provide …

ai alex authors eric generativeai kai lilian weng llms notes openai paper prompt prompt injection promptinjection security six training training llms

More from simonwillison.net / Simon Willison's Weblog

Si

I'm writing a new vector search SQLite Extension 14 hours ago | simonwillison.net

alex alexgarcia dependencies embeddings +14

Si

Quoting Zach Seward 21 hours ago | simonwillison.net

advances ai attention bias +14

Si

Printing music with CSS Grid 1 day, 3 hours ago | simonwillison.net

application bond column css +10

Si

We can have a different web 1 day, 15 hours ago | simonwillison.net

audio dog headphones mollywhite +2

Si

Quoting Tom Eastman 1 day, 15 hours ago | simonwillison.net

five internet remember when text +2

Si

Llama 3 prompt formats 1 day, 23 hours ago | simonwillison.net

ai clear documentation every +12

Si

Introducing the Claude Team plan and iOS app 2 days, 1 hour ago | simonwillison.net

access anthropic app claude +11

Si

Save the Web by Being Nice 2 days, 15 hours ago | simonwillison.net

andrew article blog blogging +6

Si

Quoting LMSYS 2 days, 21 hours ago | simonwillison.net

ai api commercial community +9

AI Research Scientist

@ Vara | Berlin, Germany and Remote

View on ai-jobs.net

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Data Engineer (m/f/d)

@ Project A Ventures | Berlin, Germany

View on ai-jobs.net

Principle Research Scientist

@ Analog Devices | US, MA, Boston

View on ai-jobs.net