March 27, 2024, 4:49 a.m. | Razan Baltaji, Saurabh Pujar, Louis Mandel, Martin Hirzel, Luca Buratti, Lav Varshney

cs.CL updates on arXiv.org

arXiv:2310.16937v2 Announce Type: replace
Abstract: Large language models (LLMs) have become remarkably good at improving developer productivity for high-resource programming languages. These models use two kinds of data: large amounts of unlabeled code samples for pre-training and relatively small amounts of labeled code samples for fine-tuning or in-context learning. Unfortunately, many programming languages are low-resource, lacking labeled samples for most tasks and often even lacking unlabeled samples. Therefore, users of low-resource languages (e.g., legacy or new languages) miss out on …
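The abstract contrasts two ways labeled code samples are consumed: gradient-based fine-tuning and in-context learning. As a rough illustration of the latter (not taken from the paper; the language name MiniLang, the samples, and the helper function below are hypothetical), the Python sketch that follows assembles a handful of labeled examples into a few-shot prompt instead of updating model weights.

# Minimal sketch of in-context learning with labeled code samples.
# Everything here is illustrative: MiniLang, the samples, and the
# prompt format are assumptions, not the paper's actual setup.

LABELED_SAMPLES = [
    # (natural-language description, code in the low-resource language)
    ("add two numbers", "fn add(a, b) { return a + b }"),
    ("negate a boolean", "fn negate(x) { return !x }"),
]

def build_few_shot_prompt(task_description: str) -> str:
    """Concatenate the labeled samples and the new task into a single prompt."""
    parts = ["Translate each description into MiniLang code."]
    for description, code in LABELED_SAMPLES:
        parts.append(f"Description: {description}\nCode: {code}")
    parts.append(f"Description: {task_description}\nCode:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    # The resulting prompt would be sent to an LLM; no model call is made here.
    print(build_few_shot_prompt("reverse a string"))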
