Trustworthy Actionable Perturbations
May 21, 2024, 4:41 a.m. | Jesse Friedbaum, Sudarshan Adiga, Ravi Tandon
Source: cs.LG updates on arXiv.org
Abstract: Counterfactuals, or modified inputs that lead to a different outcome, are an important tool for understanding the logic used by machine learning classifiers and how to change an undesirable classification. Even if a counterfactual changes a classifier's decision, however, it may not affect the true underlying class probabilities, i.e., the counterfactual may act like an adversarial attack and "fool" the classifier. We propose a new framework for creating modified inputs that change the true underlying …
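The distinction the abstract draws, between changing a classifier's decision and changing the true class probability, can be illustrated with a toy sketch. This is not the authors' framework: the "true" probability function, the classifier weights `W` and `B`, and the helper `closest_counterfactual` are all hypothetical choices made for illustration. The classifier here leans on a second feature that the assumed ground truth ignores, so the nearest counterfactual flips the prediction while the true probability stays below 0.5.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical ground truth: the real class probability depends only on feature 0.
def true_prob(x):
    return sigmoid(2.0 * x[0] - 1.0)

# Hypothetical learned linear classifier that also weights feature 1,
# which the assumed ground truth ignores.
W = [1.0, 3.0]
B = -1.0

def classifier_score(x):
    return sum(w * xi for w, xi in zip(W, x)) + B

def classifier_prob(x):
    return sigmoid(classifier_score(x))

def closest_counterfactual(x, margin=1e-3):
    """Smallest L2 move that puts x just across the linear decision boundary."""
    score = classifier_score(x)
    target = margin if score < 0 else -margin  # land slightly on the other side
    step = (target - score) / sum(w * w for w in W)
    return [xi + step * w for xi, w in zip(x, W)]

x = [0.0, 0.0]
x_cf = closest_counterfactual(x)

print("classifier:", classifier_prob(x), "->", classifier_prob(x_cf))
print("true prob: ", true_prob(x), "->", true_prob(x_cf))
```

In this toy setup the counterfactual moves mostly along feature 1, so the classifier's output crosses 0.5 while the assumed true probability does not; this is the adversarial-like behavior the abstract warns about.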