Undersampling is a Minimax Optimal Robustness Intervention in Nonparametric Classification. (arXiv:2205.13094v1 [cs.LG])
May 27, 2022, 1:11 a.m. | Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto
stat.ML updates on arXiv.org arxiv.org
While a broad range of techniques has been proposed to tackle distribution
shift, the simple baseline of training on an $\textit{undersampled}$ dataset
often achieves close to state-of-the-art accuracy across several popular
benchmarks. This is rather surprising, since undersampling algorithms discard
excess majority group data. To understand this phenomenon, we ask if learning
is fundamentally constrained by a lack of minority group samples. We prove that
this is indeed the case in the setting of nonparametric binary classification.
Our results show that …
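The undersampling baseline the abstract refers to can be sketched in a few lines: discard excess majority-group samples until both groups are the same size, then train on the balanced subset. A minimal sketch on a synthetic dataset (the data, group sizes, and `undersample` helper here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 900 majority-group samples (label 0),
# 100 minority-group samples (label 1).
X = rng.normal(size=(1000, 2))
y = np.concatenate([np.zeros(900, dtype=int), np.ones(100, dtype=int)])

def undersample(X, y, rng):
    """Randomly discard excess majority-group samples so all groups
    are reduced to the size of the smallest group."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]

X_bal, y_bal = undersample(X, y, rng)
print(np.bincount(y_bal))  # both groups now have 100 samples
```

Any classifier trained on `(X_bal, y_bal)` then sees both groups equally often, which is the intervention whose minimax optimality the paper analyzes.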