all AI news
Anthropic researchers wear down AI ethics with repeated questions
April 2, 2024, 8:33 p.m. | Devin Coldewey
TechCrunch techcrunch.com
How do you get an AI to answer a question it’s not supposed to? There are many such “jailbreak” techniques, and Anthropic researchers just found a new one, in which a large language model can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions […]
© 2024 TechCrunch. All rights reserved. For personal use only.
ai ai ethics anthropic build ethics found jailbreak language language model large language large language model prime question questions researchers
More from techcrunch.com / TechCrunch
U.K. agency releases tools to test AI model safety
1 day, 18 hours ago |
techcrunch.com
At the AI Film Festival, humanity triumphed over tech
1 day, 20 hours ago |
techcrunch.com
OpenAI’s ChatGPT announcement: What we know so far
2 days, 16 hours ago |
techcrunch.com
Anthropic’s Claude sees tepid reception on iOS compared with ChatGPT’s debut
2 days, 18 hours ago |
techcrunch.com
Jobs in AI, ML, Big Data
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York