Fixing confirmation bias in feature attribution methods via semantic match
Feb. 22, 2024, 5:42 a.m. | Giovanni Cinà, Daniel Fernandez-Llaneza, Nishant Mishra, Tabea E. Röber, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil
cs.LG updates on arXiv.org
Abstract: Feature attribution methods have become a staple method to disentangle the complex behavior of black box models. Despite their success, some scholars have argued that such methods suffer from a serious flaw: they do not allow a reliable interpretation in terms of human concepts. Simply put, visualizing an array of feature contributions is not enough for humans to conclude something about a model's internal representations, and confirmation bias can trick users into false beliefs about …