Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias. (arXiv:2210.02720v1 [cs.LG])
stat.ML updates on arXiv.org
Gradient regularization (GR) is a method that penalizes the gradient norm of
the training loss during training. Although some studies have reported that GR
improves generalization performance in deep learning, little attention has been
paid to it from an algorithmic perspective, that is, to algorithms that
implement GR efficiently. In this study, we first reveal that a specific
finite-difference computation, composed of both gradient ascent and descent
steps, reduces the computational cost of GR. In addition, this computation …