In-Context Language Learning: Architectures and Algorithms. (arXiv:2401.12973v2 [cs.CL] UPDATED)
cs.CL updates on arXiv.org
Large-scale neural language models exhibit a remarkable capacity for
in-context learning (ICL): they can infer novel functions from datasets
provided as input. Most of our current understanding of when and how ICL arises
comes from LMs trained on extremely simple learning problems like linear
regression and associative recall. There remains a significant gap between
these model problems and the "real" ICL exhibited by LMs trained on large text
corpora, which involves not just retrieval and function approximation but
free-form generation …
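To make the "model problem" setting concrete, here is a minimal sketch of the kind of in-context linear-regression episode the abstract alludes to: the target function is shown only through context (x, y) pairs, and a learner must infer it to answer a query. The episode construction and the closed-form solver below are illustrative assumptions, not the paper's actual training setup.

```python
import random

def make_icl_episode(w, n_context=8):
    # One episode: context pairs (x, w*x) plus a held-out query x.
    # The weight w is never shown directly, only via the examples.
    xs = [random.uniform(-1, 1) for _ in range(n_context + 1)]
    context = [(x, w * x) for x in xs[:-1]]
    query = xs[-1]
    return context, query, w * query

def infer_from_context(context, query):
    # Closed-form 1-D least squares stands in for whatever procedure
    # a trained LM is hypothesized to implement implicitly.
    num = sum(x * y for x, y in context)
    den = sum(x * x for x, _ in context)
    return (num / den) * query

random.seed(0)
context, query, target = make_icl_episode(w=2.5)
pred = infer_from_context(context, query)
print(abs(pred - target) < 1e-9)  # exact recovery in the noiseless case
```

An LM said to "do ICL" on this task would receive the context pairs serialized in its prompt and emit the query's y-value directly, with no weight update.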