Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization
Feb. 23, 2024, 5:43 a.m. | Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala
Source: cs.LG updates on arXiv.org
Abstract: We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting. Our method is grounded in the principles of distributionally robust optimization (DRO) with Kullback-Leibler divergence. RGD is simple to implement, computationally efficient, and compatible with widely used optimizers such as SGD and Adam. We demonstrate the broad applicability and impact of RGD by achieving state-of-the-art results on diverse benchmarks, including improvements of …
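To make the idea of dynamic sample importance weighting concrete, here is a minimal sketch in plain Python. It relies on a standard fact about DRO with a KL-divergence penalty: the worst-case distribution tilts each sample's weight in proportion to the exponential of its loss. The abstract does not spell out the authors' exact update rule, so everything below is an assumption made for illustration: the helper names `kl_dro_weights` and `reweighted_sgd_step`, the `temperature` and `clip` parameters, and the 1-D least-squares model are all hypothetical and not taken from the paper.

```python
import math


def kl_dro_weights(losses, temperature=1.0, clip=5.0):
    """Turn per-sample losses into normalized weights by exponential tilting.

    Hypothetical helper: `temperature` and `clip` are illustrative knobs,
    not parameters taken from the abstract.
    """
    # Clip each loss so that a single outlier cannot dominate the batch.
    scaled = [min(loss, clip) / temperature for loss in losses]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_sgd_step(w, b, xs, ys, lr=0.1, temperature=1.0):
    """One re-weighted gradient step for 1-D least squares, y = w * x + b."""
    residuals = [w * x + b - y for x, y in zip(xs, ys)]
    losses = [0.5 * r * r for r in residuals]
    # Assumption: weights are treated as constants, so no gradient
    # flows through them.
    weights = kl_dro_weights(losses, temperature)
    # Weighted sum of per-sample gradients replaces the usual mean.
    grad_w = sum(p * r * x for p, r, x in zip(weights, residuals, xs))
    grad_b = sum(p * r for p, r in zip(weights, residuals))
    return w - lr * grad_w, b - lr * grad_b
```

Calling `reweighted_sgd_step` in a loop in place of a plain SGD update is the whole change: harder samples (larger loss) receive larger weights at each step, and the base optimizer is otherwise untouched. In a deep-learning framework the same pattern would apply to per-sample losses computed before reduction.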