April 28, 2022, 1:11 a.m. | Lukas Schulth, Christian Berghoff, Matthias Neu

cs.LG updates on arXiv.org

Predictions made by neural networks can be fraudulently altered by so-called
poisoning attacks; backdoor poisoning attacks are a special case. We study
suitable detection methods and introduce a new method called Heatmap
Clustering. There, we apply a $k$-means clustering algorithm to heatmaps
produced by the state-of-the-art explainable AI method Layer-wise Relevance
Propagation. The goal is to separate poisoned from unpoisoned data in the
dataset. We compare this method with a similar method, called Activation
Clustering, which also uses $k$-means clustering …
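The idea of Heatmap Clustering can be sketched in a few lines: flatten each sample's relevance heatmap into a vector, run $k$-means with two clusters, and treat the minority cluster as the suspect (poisoned) set. This is a minimal illustration under assumed details (heatmaps as NumPy arrays, scikit-learn's `KMeans`, a minority-cluster heuristic), not the authors' exact pipeline; the function name `heatmap_clustering` and the synthetic data are hypothetical.

```python
# Hypothetical sketch of Heatmap Clustering: k-means (k=2) on flattened
# relevance heatmaps. The minority-cluster heuristic and all parameters
# are assumptions for illustration, not the paper's exact method.
import numpy as np
from sklearn.cluster import KMeans

def heatmap_clustering(heatmaps, n_clusters=2, seed=0):
    """Cluster per-sample heatmaps; flag the smaller cluster as suspect."""
    X = np.asarray(heatmaps).reshape(len(heatmaps), -1)  # flatten each heatmap
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(X)
    counts = np.bincount(labels, minlength=n_clusters)
    suspect = int(np.argmin(counts))  # assume poisoned samples are the minority
    return labels, suspect

# Synthetic demo: 90 "clean" heatmaps near 0, 10 "poisoned" heatmaps near 5.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 0.1, size=(90, 8, 8))
poisoned = rng.normal(5.0, 0.1, size=(10, 8, 8))
labels, suspect = heatmap_clustering(np.concatenate([clean, poisoned]))
flagged = np.flatnonzero(labels == suspect)
print(len(flagged))
```

In practice the separation between clean and poisoned heatmaps is far less clean than in this toy example, which is precisely why the choice of heatmap method (here, LRP) matters.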

