April 2, 2024, 11:05 p.m.

Techmeme www.techmeme.com


Devin Coldewey / TechCrunch:

Anthropic researchers detail “many-shot jailbreaking”, which can evade LLMs' safety guardrails by including a large number of faux dialogues in a single prompt  —  How do you get an AI to answer a question it's not supposed to?  There are many such “jailbreak” techniques …

Tags: anthropic, devin, guardrails, jailbreaking, llms, prompt, researchers, safety, techcrunch
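The technique described above boils down to padding a single prompt with a long run of faux user/assistant dialogues before the real question. A minimal sketch of that structure follows, assuming a simple "User:/Assistant:" turn format; the exact dialogue formatting, shot counts, and content used by the Anthropic researchers are not given in this item, so the placeholders below are purely illustrative and carry no real payload.

# Illustrative sketch of a many-shot prompt's *structure* only.
# The turn format and shot count here are assumptions; placeholder
# strings stand in for the faux dialogue content.

def build_many_shot_prompt(faux_dialogues, final_question):
    """Concatenate many faux question/answer turns into one prompt string."""
    turns = []
    for question, answer in faux_dialogues:
        turns.append(f"User: {question}\nAssistant: {answer}")
    # The real question is appended only after the long run of faux turns.
    turns.append(f"User: {final_question}\nAssistant:")
    return "\n\n".join(turns)

if __name__ == "__main__":
    # Hundreds of faux dialogues fit in a single prompt because modern
    # LLMs accept very long context windows, which is what the research
    # reportedly exploits.
    shots = [(f"placeholder question {i}", f"placeholder answer {i}") for i in range(256)]
    prompt = build_many_shot_prompt(shots, "the actual question goes here")
    print(f"{len(shots)} faux dialogues, prompt length: {len(prompt)} characters")

The point of the sketch is that the attack requires no API access or model internals; it is ordinary prompt construction, scaled up until the in-context examples start to dominate the model's safety-trained behavior.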
