April 6, 2024, 7:42 p.m. | /u/claren0

Machine Learning www.reddit.com

What are some good books for mechanistic interpretability in machine learning? I'm struggling to find a good book that I can read on this topic. I currently do research in optimization and would like to learn more about internal representations, interventions, and mechanistic interpretability in AI models.
