Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
June 5, 2024, 4:43 a.m. | Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
cs.LG updates on arXiv.org
Abstract: Large language models can solve tasks that were not present in the training set. This capability is believed to be due to in-context learning and skill composition. In this work, we study the emergence of in-context learning and skill composition in a collection of modular arithmetic tasks. Specifically, we consider a finite collection of linear modular functions $z = a \, x + b \, y \;\mathrm{mod}\; p$ labeled by the vector $(a, b) \in …
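The task family in the abstract is concrete enough to sketch: each task is a linear modular function $z = a x + b y \bmod p$, labeled by its coefficient vector $(a, b)$. A minimal, illustrative generator for one such task (this is an assumption for clarity, not the authors' code) might look like:

```python
# Minimal sketch of the modular arithmetic task family from the abstract:
# each task computes z = (a*x + b*y) mod p and is labeled by (a, b).
import itertools

def make_task(a: int, b: int, p: int):
    """Return all (x, y, z) triples for the task labeled (a, b) over Z_p."""
    return [(x, y, (a * x + b * y) % p)
            for x, y in itertools.product(range(p), repeat=2)]

# Example: the task (a, b) = (2, 3) over Z_5 has p*p = 25 input pairs.
task = make_task(2, 3, p=5)
print(len(task))   # 25
print(task[7])     # (1, 2, 3): z = (2*1 + 3*2) % 5 = 3
```

A model studying in-context learning would see sequences of such triples from an unseen $(a, b)$ and have to infer the task from context alone.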