all AI news
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning
March 12, 2024, 4:52 a.m. | Yichuan Li, Xiyao Ma, Sixing Lu, Kyumin Lee, Xiaohu Liu, Chenlei Guo
cs.CL updates on arXiv.org arxiv.org
Abstract: Large Language models (LLMs) have demonstrated impressive in-context learning (ICL) capabilities, where a LLM makes predictions for a given test input together with a few input-output pairs (demonstrations). Nevertheless, the inclusion of demonstrations leads to a quadratic increase in the computational overhead of the self-attention mechanism. Existing solutions attempt to distill lengthy demonstrations into compact vectors. However, they often require task-specific retraining or compromise LLM's in-context learning performance. To mitigate these challenges, we present Meta …
abstract arxiv attention capabilities computational context cs.ai cs.cl distillation inclusion in-context learning input-output language language models large language large language models leads llm llms meta predictions self-attention test together type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Data Engineer
@ Kaseya | Bengaluru, Karnataka, India