To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets
March 6, 2024, 5:43 a.m. | Darshil Doshi, Aritra Das, Tianyu He, Andrey Gromov
cs.LG updates on arXiv.org | arxiv.org
Abstract: Robust generalization is a major challenge in deep learning, particularly when the number of trainable parameters is very large. In general, it is very difficult to know if the network has memorized a particular set of examples or understood the underlying rule (or both). Motivated by this challenge, we study an interpretable model where generalizing representations are understood analytically, and are easily distinguishable from the memorizing ones. Namely, we consider multi-layer perceptron (MLP) and Transformer …