Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets (arXiv:2201.02177v1 [cs.LG])

Jan. 7, 2022, 2:10 a.m. | Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra

cs.LG updates on arXiv.org arxiv.org

In this paper we propose to study generalization of neural networks on small
algorithmically generated datasets. In this setting, questions about data
efficiency, memorization, generalization, and speed of learning can be studied
in great detail. In some situations we show that neural networks learn through
a process of "grokking" a pattern in the data, improving generalization
performance from random chance level to perfect generalization, and that this
improvement in generalization can happen well past the point of overfitting. We
also …
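
To make the setting concrete, here is a minimal Python sketch of one small algorithmically generated dataset. The excerpt above does not name a specific task, so the choice of modular addition, the modulus p = 97, the 50% training fraction, and the helper names make_modular_addition_dataset, accuracy, and memorizer are all illustrative assumptions, not the authors' code. The lookup-table "memorizer" is not a neural network; it only illustrates the gap between fitting the training equations and generalizing to held-out ones.

```python
import random


def make_modular_addition_dataset(p=97, train_fraction=0.5, seed=0):
    """Enumerate every equation a + b = c (mod p) and split it at random.

    Assumption: modular addition stands in for "an algorithmic task";
    p, train_fraction and seed are hypothetical defaults.
    """
    examples = [((a, b), (a + b) % p) for a in range(p) for b in range(p)]
    rng = random.Random(seed)
    rng.shuffle(examples)
    n_train = int(len(examples) * train_fraction)
    return examples[:n_train], examples[n_train:]


def accuracy(predict, examples):
    """Fraction of equations for which predict(a, b) returns the right c."""
    correct = sum(predict(a, b) == c for (a, b), c in examples)
    return correct / len(examples)


train, val = make_modular_addition_dataset()

# A pure memorizer: perfect recall of the training equations,
# and a fixed guess (0) for any pair it has never seen.
table = dict(train)


def memorizer(a, b):
    return table.get((a, b), 0)


print("train accuracy:", accuracy(memorizer, train))
print("validation accuracy:", accuracy(memorizer, val))
```

Because the whole table of p * p equations can be enumerated, training and validation accuracy can be measured exactly, which is what makes this kind of dataset convenient for studying memorization versus generalization. The memorizer scores perfectly on the training split and roughly at chance level on the validation split.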


