Feb. 23, 2024, 5:43 a.m. | Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala

cs.LG updates on arXiv.org arxiv.org

arXiv:2306.09222v3 Announce Type: replace
Abstract: We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting. Our method is grounded in the principles of distributionally robust optimization (DRO) with Kullback-Leibler divergence. RGD is simple to implement, computationally efficient, and compatible with widely used optimizers such as SGD and Adam. We demonstrate the broad applicability and impact of RGD by achieving state-of-the-art results on diverse benchmarks, including improvements of …
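To make the idea concrete, here is a minimal sketch of dynamic, loss-based sample re-weighting layered on top of a standard PyTorch optimizer. The exponential tilt `exp(loss / tau)` and the temperature `tau` are assumptions drawn from the usual KL-divergence DRO dual form, not necessarily the paper's exact recipe.

```python
# Sketch of KL-DRO-style sample re-weighting on top of SGD/Adam (PyTorch).
# Assumption: weights follow the KL-DRO dual, w_i ∝ exp(loss_i / tau);
# the paper's exact weighting scheme may differ.
import torch
import torch.nn as nn

def reweighted_step(model, optimizer, x, y, tau=1.0):
    """One optimization step with dynamically re-weighted per-sample losses."""
    optimizer.zero_grad()
    per_sample_loss = nn.functional.cross_entropy(model(x), y, reduction="none")
    # Up-weight hard examples via an exponential tilt of their current loss.
    weights = torch.softmax(per_sample_loss.detach() / tau, dim=0)
    loss = (weights * per_sample_loss).sum()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Used in place of the usual mean-reduction step inside an ordinary training loop, this keeps the base optimizer (SGD, Adam, etc.) unchanged, which is what makes the approach a drop-in addition rather than a new optimizer.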

