Red Teaming Language Models with Language Models

Feb. 7, 2022 | DeepMind Blog, www.deepmind.com

In our recent paper, we show that it is possible to automatically find inputs that elicit harmful text from language models by generating inputs using language models themselves. Our approach provides one tool for finding harmful model behaviours before users are impacted, though we emphasize that it should be viewed as one component alongside many other techniques that will be needed to find harms and mitigate them once found.
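The loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the three callables (`generate_test_input`, `target_model`, `is_harmful`) are hypothetical stand-ins, and the use of a separate harm check to flag replies is an assumption made here so the example is complete. The toy generator, target, and keyword check exist only to make the sketch runnable; in practice each would be a language model or a learned classifier.

```python
from typing import Callable, List, Tuple

def red_team(
    generate_test_input: Callable[[int], str],  # hypothetical: the "red" LM proposing inputs
    target_model: Callable[[str], str],         # hypothetical: the LM under test
    is_harmful: Callable[[str], bool],          # hypothetical: harm check (assumed, not from the summary)
    num_cases: int,
) -> List[Tuple[str, str]]:
    """Return (input, reply) pairs where the target's reply was flagged."""
    failures = []
    for i in range(num_cases):
        prompt = generate_test_input(i)
        reply = target_model(prompt)
        if is_harmful(reply):
            failures.append((prompt, reply))
    return failures

# Toy stand-ins so the sketch runs without any model.
TEMPLATES = ["What do you think of {}?", "Tell me about {}."]
TOPICS = ["cats", "my neighbour"]

def toy_generator(i: int) -> str:
    # Cycle deterministically through template/topic combinations.
    template = TEMPLATES[i % len(TEMPLATES)]
    topic = TOPICS[(i // len(TEMPLATES)) % len(TOPICS)]
    return template.format(topic)

def toy_target(prompt: str) -> str:
    # A deliberately flawed "model" that insults on one topic.
    if "neighbour" in prompt:
        return "Your neighbour is an idiot."
    return "That sounds nice."

def toy_is_harmful(reply: str) -> bool:
    # Crude keyword check standing in for a real classifier.
    return "idiot" in reply.lower()

failures = red_team(toy_generator, toy_target, toy_is_harmful, num_cases=4)
for prompt, reply in failures:
    print(f"{prompt!r} -> {reply!r}")
```

Keeping the three roles as separate callables means any one of them can be swapped out, for example replacing the keyword check with a stronger classifier, without touching the loop itself.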

