Interpretability is a Kind of Safety: An Interpreter-based Ensemble for Adversary Defense. (arXiv:2304.06919v1 [cs.LG])
cs.LG updates on arXiv.org
Despite their great success in real-life applications, deep neural network
(DNN) models have long been criticized for their vulnerability to adversarial
attacks. Tremendous research effort has gone into mitigating these threats,
but the essential nature of adversarial examples is still unclear, and most
existing defenses remain vulnerable to hybrid attacks and susceptible to
counterattacks. In light of this, this paper first reveals a gradient-based
correlation between sensitivity analysis-based DNN interpreters …
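The abstract's core observation can be illustrated with a minimal, hypothetical numpy sketch (not the paper's actual method): for a toy linear model, the saliency map produced by a sensitivity-analysis interpreter and the perturbation direction used by a gradient-based attack such as FGSM are both derived from the same input gradient. All names here (`model`, `saliency_map`, `fgsm`, the weights `w`) are illustrative assumptions.

```python
import numpy as np

def model(x, w):
    """Toy linear model f(x) = w.x; its input gradient is simply w."""
    return float(w @ x)

def input_gradient(x, w):
    """Gradient of the model output w.r.t. the input (constant w here)."""
    return w.copy()

def saliency_map(x, w):
    """Sensitivity-analysis interpretation: magnitude of the input gradient."""
    return np.abs(input_gradient(x, w))

def fgsm(x, w, eps):
    """Fast Gradient Sign Method: step along the sign of the same gradient."""
    return x + eps * np.sign(input_gradient(x, w))

w = np.array([0.5, -2.0, 1.0])
x = np.array([1.0, 1.0, 1.0])

sal = saliency_map(x, w)       # interpreter highlights feature 1 most (|-2.0|)
x_adv = fgsm(x, w, eps=0.1)    # attack perturbs most strongly along those axes
```

Because both quantities flow from the same gradient, an interpreter's saliency map carries information about which attack directions are most effective, which is the kind of correlation the abstract points to as a basis for an interpreter-based defense.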