all AI news
Quadratic models for understanding catapult dynamics of neural networks
May 3, 2024, 4:54 a.m. | Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
cs.LG updates on arXiv.org arxiv.org
Abstract: While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models. In this work we show that recently proposed Neural Quadratic Models can exhibit the "catapult phase" [Lewkowycz et al. 2020] that arises when training such models with large learning rates. We then empirically show that the behaviour of neural quadratic models parallels that of neural networks in generalization, especially in …
abstract arxiv cs.lg dynamics linear math.oc networks neural networks show stat.ml type understanding while work
More from arxiv.org / cs.LG updates on arXiv.org
Testing the Segment Anything Model on radiology data
1 day, 11 hours ago |
arxiv.org
Calorimeter shower superresolution
1 day, 11 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US