March 27, 2024, 4:49 a.m. | Razan Baltaji, Saurabh Pujar, Louis Mandel, Martin Hirzel, Luca Buratti, Lav Varshney

cs.CL updates on arXiv.org

arXiv:2310.16937v2 Announce Type: replace
Abstract: Large language models (LLMs) have become remarkably good at improving developer productivity for high-resource programming languages. These models use two kinds of data: large amounts of unlabeled code samples for pre-training and relatively smaller amounts of labeled code samples for fine-tuning or in-context learning. Unfortunately, many programming languages are low-resource, lacking labeled samples for most tasks and often even lacking unlabeled samples. Therefore, users of low-resource languages (e.g., legacy or new languages) miss out on …
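The abstract contrasts two data regimes: large unlabeled corpora for pre-training and small labeled sets for fine-tuning or in-context learning, the latter being what low-resource languages typically lack. To make the in-context learning path concrete, here is a minimal sketch (not from the paper) of how a handful of labeled samples for a low-resource language might be packed into a few-shot prompt instead of being used for fine-tuning; the task, the example language, and the helper name are illustrative assumptions.

# Hypothetical sketch of few-shot in-context learning with scarce labeled code.
def build_few_shot_prompt(labeled_samples, task_description, query):
    """Assemble a few-shot prompt from (input code, expected output) pairs."""
    parts = [task_description]
    for source, target in labeled_samples:
        parts.append(f"Input:\n{source}\nOutput:\n{target}")
    # The unanswered query goes last; the model is expected to complete it.
    parts.append(f"Input:\n{query}\nOutput:")
    return "\n\n".join(parts)

# Two labeled samples for a hypothetical low-resource language.
samples = [
    ("add x y = x + y", "-- Adds two integers."),
    ("neg x = 0 - x", "-- Negates an integer."),
]
prompt = build_few_shot_prompt(
    samples,
    "Write a one-line documentation comment for each function.",
    "double x = x + x",
)
print(prompt)  # This prompt string would then be sent to a code LLM.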

