Feb. 28, 2024, 5:42 a.m. | Biao Zhang, Zhongtao Liu, Colin Cherry, Orhan Firat

cs.LG updates on arXiv.org

arXiv:2402.17193v1 Announce Type: cross
Abstract: While large language models (LLMs) often adopt finetuning to unlock their capabilities for downstream applications, our understanding of the inductive biases (especially the scaling properties) of different finetuning methods is still limited. To fill this gap, we conduct systematic experiments studying whether and how different scaling factors, including LLM model size, pretraining data size, new finetuning parameter size, and finetuning data size, affect the finetuning performance. We consider two types of finetuning -- full-model tuning …
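Scaling studies like the one described typically fit a power law relating a scaling factor (e.g., finetuning data size) to performance. As a minimal sketch of that fitting step, the snippet below recovers the parameters of an assumed power-law form `loss = a * D**(-b)` from synthetic data; the functional form, variable names, and numbers are illustrative assumptions, not results from the paper.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss = a * sizes**(-b) by least squares in log-log space.

    Taking logs turns the power law into a line:
    log(loss) = log(a) - b * log(size), so a degree-1 polyfit suffices.
    """
    log_x, log_y = np.log(sizes), np.log(losses)
    slope, log_a = np.polyfit(log_x, log_y, 1)
    return np.exp(log_a), -slope  # (a, b)

# Synthetic data generated exactly from a power law (illustrative only).
sizes = np.array([1e3, 1e4, 1e5, 1e6])   # hypothetical finetuning data sizes
losses = 2.0 * sizes ** -0.3              # hypothetical eval losses

a, b = fit_power_law(sizes, losses)
print(round(a, 3), round(b, 3))  # recovers a ≈ 2.0, b ≈ 0.3
```

On real measurements the fit would use noisy (size, loss) pairs, and richer joint forms over several factors (model size, data size, tuned-parameter count) are common; this sketch only shows the basic log-space regression.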

