April 2, 2024, 7:45 p.m. | Gilad Deutch, Nadav Magar, Tomer Bar Natan, Guy Dar

cs.LG updates on arXiv.org

arXiv:2311.07772v4 Announce Type: replace-cross
Abstract: In-context learning (ICL) has shown impressive results in few-shot learning tasks, yet its underlying mechanism is still not fully understood. A recent line of work suggests that ICL performs gradient descent (GD)-based optimization implicitly. While appealing, much of the research focuses on simplified settings, where the parameters of a shallow model are optimized. In this work, we revisit evidence for ICL-GD correspondence on realistic NLP tasks and models. We find gaps in evaluation, both in …
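The ICL-GD correspondence the abstract refers to can be made concrete with a toy probe. The sketch below is a minimal illustration under assumptions, not the paper's protocol (which the abstract in fact critiques): the choice of gpt2, the single-step SGD fine-tune, the demonstration strings, and the cosine metric over last-layer hidden states are all placeholders for whatever a real evaluation would use.

```python
# Toy probe of the ICL-GD correspondence hypothesis (illustrative sketch only):
# compare the hidden-state shift a model undergoes when demonstrations are
# prepended in context (ICL) against the shift induced by one gradient-descent
# step on those same demonstrations (GD).
import copy
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

demos = "great -> positive\nterrible -> negative\n"  # hypothetical demonstrations
query = "wonderful ->"

def last_hidden(m, text):
    """Last-layer hidden state at the final token of `text`."""
    ids = tok(text, return_tensors="pt")
    out = m(**ids, output_hidden_states=True)
    return out.hidden_states[-1][0, -1]

with torch.no_grad():
    h_base = last_hidden(model, query)           # query alone, original weights
    h_icl = last_hidden(model, demos + query)    # ICL: demonstrations in context

# GD: one fine-tuning step on the demonstrations, then the bare query.
model_gd = copy.deepcopy(model)
model_gd.train()
opt = torch.optim.SGD(model_gd.parameters(), lr=1e-3)  # lr is an assumption
batch = tok(demos, return_tensors="pt")
loss = model_gd(**batch, labels=batch["input_ids"]).loss
loss.backward()
opt.step()
model_gd.eval()
with torch.no_grad():
    h_gd = last_hidden(model_gd, query)

# A strong ICL-GD correspondence would predict these update directions align.
sim = F.cosine_similarity(h_icl - h_base, h_gd - h_base, dim=0)
print(f"cosine similarity of ICL vs. GD hidden-state updates: {sim.item():.3f}")
```

A high similarity here is exactly the kind of evidence the paper scrutinizes: the abstract notes that such scores can be misleading without careful metrics and baselines (e.g., checking whether untrained models score comparably).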
