Anthropic researchers: AI models can be trained to deceive and the most commonly used AI safety techniques had little to no effect on the deceptive behaviors (Kyle Wiggers/TechCrunch)

Jan. 13, 2024, 9:50 p.m.

Kyle Wiggers / TechCrunch:

Most humans learn the skill of deceiving other humans. Can AI models learn the same? The answer seems to be yes, and, terrifyingly, they're exceptionally good at it.
