April 1, 2024, 4:47 a.m. | Mohsen Gholami, Mohammad Akbari, Cindy Hu, Vaden Masrani, Z. Jane Wang, Yong Zhang

cs.CL updates on arXiv.org

arXiv:2403.19754v1 Announce Type: new
Abstract: Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed generating data with LLMs to train distilled models. We argue that data generation with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, …
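
The abstract's core claim, that sampling from a generator over-represents the center of a distribution and under-represents its tails, can be illustrated with a small self-contained sketch. This is not code from the paper; the toy categorical distribution and the temperature parameter are illustrative assumptions, standing in for the effect of mode-seeking generation:

```python
# A minimal sketch (not from the paper): softmax-style sampling at
# temperature < 1 concentrates probability mass on high-probability
# items, so low-probability "tail" items are generated less often.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" content distribution with a tail of rare items.
true_probs = np.array([0.50, 0.25, 0.12, 0.08, 0.05])

def sharpened(p, temperature):
    """Renormalize p**(1/T); T < 1 sharpens the distribution toward its mode."""
    q = p ** (1.0 / temperature)
    return q / q.sum()

# Assumed generator behavior: a sharpened version of the true distribution.
gen_probs = sharpened(true_probs, temperature=0.7)

n = 100_000
true_freq = rng.multinomial(n, true_probs) / n
gen_freq = rng.multinomial(n, gen_probs) / n

for i, (t, g) in enumerate(zip(true_freq, gen_freq)):
    print(f"item {i}: true={t:.3f}  generated={g:.3f}")
# Tail items appear noticeably less often in the generated sample, so a
# student model distilled on such data sees few low-probability examples.
```

Under this toy setup, the rarest items shrink from roughly 5% of the true data to around 3% of the generated data, which is the "forgetting the tails" failure mode the abstract describes.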

