April 6, 2024, 7:42 p.m. | /u/claren0

Machine Learning www.reddit.com

What are some good books for mechanistic interpretability in machine learning? I'm struggling to find a good book that I can read on this topic. I currently do research in optimization and would like to learn more about internal representations, interventions, and mechanistic interpretability in AI models.
