Jan. 1, 2024 | Ruiqi Zhang, Spencer Frei, Peter L. Bartlett

JMLR www.jmlr.org

Attention-based neural networks such as transformers have demonstrated a remarkable ability to exhibit in-context learning (ICL): given a short prompt sequence of tokens from an unseen task, they can formulate relevant per-token and next-token predictions without any parameter updates. Embedding a sequence of labeled training data and unlabeled test data as a prompt allows transformers to behave like supervised learning algorithms. Indeed, recent work has shown that when training transformer architectures over random instances of linear regression …
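To make the prompt construction concrete, below is a minimal sketch assuming NumPy and the common convention of interleaving inputs and labels as prompt tokens; the names (`prompt`, `x_query`) and the (d+1)-dimensional token layout are illustrative assumptions, not the paper's exact embedding or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20                       # input dimension, number of labeled examples

# An unseen task: a random linear model, its labeled data, and a test input.
w = rng.normal(size=d)
X = rng.normal(size=(n, d))        # labeled training inputs
y = X @ w                          # their labels
x_query = rng.normal(size=d)       # unlabeled test input

# Embed (x_1, y_1, ..., x_n, y_n, x_query) as a prompt: each token is a
# (d+1)-dimensional column, and the label slot of the query token is zero.
prompt = np.zeros((d + 1, 2 * n + 1))
prompt[:d, 0::2] = np.concatenate([X, x_query[None, :]]).T  # input tokens
prompt[d, 1:2 * n:2] = y                                    # label tokens

# A transformer trained over random instances of this task should predict a
# label for x_query close to the ordinary-least-squares fit of the prompt data.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("target:", x_query @ w, "| OLS prediction:", x_query @ w_ols)
```

In this noiseless, well-specified setting the OLS fit recovers the task vector exactly, so it serves as the natural baseline against which a trained transformer's in-context prediction can be compared.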

