May 5, 2022, 1:11 a.m. | Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn

cs.CL updates on arXiv.org

Large language models (LMs) of code have recently shown tremendous promise in
completing code and synthesizing code from natural language descriptions.
However, the current state-of-the-art code LMs (e.g., Codex (Chen et al.,
2021)) are not publicly available, leaving many questions about their model and
data design decisions. We aim to fill in some of these blanks through a
systematic evaluation of the largest existing models: Codex, GPT-J, GPT-Neo,
GPT-NeoX-20B, and CodeParrot, across various programming languages. Although
Codex itself is not …

