Oct. 31, 2022, 1:12 a.m. | Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

cs.LG updates on arXiv.org

A common explanation for the failure of deep networks to generalize
out-of-distribution is that they fail to recover the "correct" features. We
challenge this notion with a simple experiment which suggests that ERM already
learns sufficient features and that the current bottleneck is not feature
learning, but robust regression. Our findings also imply that given a small
amount of data from the target distribution, retraining only the last linear
layer will give excellent performance. We therefore argue that devising simpler …
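
To make the last-layer retraining idea in the abstract concrete, here is a minimal sketch and not the authors' experiment. It assumes a frozen feature extractor (stood in for by a hypothetical random ReLU layer, `frozen_features`) and refits only the final linear layer on a small synthetic target-distribution sample. It uses plain ridge regression as a simple substitute for whatever regression method the paper uses. All names, sizes, and data below are illustrative assumptions.

```python
import numpy as np

def frozen_features(x, W):
    """Hypothetical stand-in for a pretrained network's penultimate layer.

    The weights W are never updated: only the head on top is refit.
    """
    return np.maximum(x @ W, 0.0)  # ReLU features

def fit_last_layer(feats, y, l2=1e-3):
    """Refit only the final linear layer via closed-form ridge regression."""
    d = feats.shape[1]
    A = feats.T @ feats + l2 * np.eye(d)
    return np.linalg.solve(A, feats.T @ y)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 16))           # assumed frozen extractor weights
x_target = rng.normal(size=(200, 5))   # small synthetic "target" sample
true_head = rng.normal(size=16)        # synthetic ground-truth head
y_target = frozen_features(x_target, W) @ true_head  # noise-free labels

# Retrain the head on target data while the features stay fixed.
feats = frozen_features(x_target, W)
head = fit_last_layer(feats, y_target, l2=1e-8)
mse = float(np.mean((feats @ head - y_target) ** 2))
print(f"target-sample MSE after refitting the head: {mse:.2e}")
```

Because the labels here are an exact linear function of the frozen features, the refit head fits them almost perfectly. On real data the fit quality would depend on how much target-relevant information the frozen features actually carry, which is the question the abstract addresses.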
