Aug. 12, 2022, 1:11 a.m. | John Duchi, Tatsunori Hashimoto, Hongseok Namkoong

stat.ML updates on arXiv.org

While modern large-scale datasets often consist of heterogeneous
subpopulations -- for example, multiple demographic groups or multiple text
corpora -- the standard practice of minimizing average loss fails to guarantee
uniformly low losses across all subpopulations. We propose a convex procedure
that controls the worst-case performance over all subpopulations of a given
size. Our procedure comes with finite-sample (nonparametric) convergence
guarantees on the worst-off subpopulation. Empirically, we observe on lexical
similarity, wine quality, and recidivism prediction tasks that our worst-case …
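To make the objective concrete: the average loss over the worst-performing subpopulation of a given fraction α of the data is the conditional value-at-risk (CVaR) of the loss distribution at level α. The sketch below is only an illustration of that worst-case objective on a fixed set of per-example losses, not the paper's convex training procedure; the function name `worst_subpop_loss` and the plug-in top-k estimate are our own choices.

```python
import numpy as np

def worst_subpop_loss(losses, alpha):
    """Plug-in estimate of the average loss on the worst-off
    alpha-fraction of examples (CVaR of the empirical losses).

    Illustrative only: sorts losses in decreasing order and
    averages the top ceil(alpha * n); it does not implement the
    paper's procedure for latent subpopulations.
    """
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]
    k = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:k].mean()
```

For example, with losses `[1, 2, 3, 4]` and α = 0.5, the worst half is `[4, 3]`, so the worst-case subpopulation loss is 3.5, while the ordinary average loss is 2.5; minimizing the average alone can leave that worst half poorly served.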
