all AI news
LLMs May Learn Deceptive Behavior and Act as Persistent Sleeper Agents
Jan. 20, 2024, 4 p.m. | Sergio De Simone
InfoQ - AI, ML & Data Engineering www.infoq.com
AI researchers at OpenAI competitor Anthropic trained proof-of-concept LLMs showing deceptive behavior triggered by specific hints in the prompts. Furthermore, they say, once deceptive behavior was trained into the model, there was no way to circumvent it using standard techniques.
By Sergio De Simoneact agents ai ai researchers anthropic behavior concept large language models learn llms ml & data engineering openai openai competitor prompts proof-of-concept researchers security sleeper agents standard
More from www.infoq.com / InfoQ - AI, ML & Data Engineering
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US