June 15, 2022, 1:12 a.m. | Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, Jian-Guang Lou

cs.CL updates on arXiv.org

Code generation is a longstanding challenge that aims to generate a code
snippet from a natural language description. Training a code generation model
usually requires expensive text-code paired data. Recently, thanks to the
success of pre-training techniques, large language models trained on
large-scale unlabelled code corpora have performed well at code generation. In
this paper, we investigate how to leverage an unlabelled code corpus to train a
model for library-oriented code generation. Since it is a common …

arxiv code code generation continual generation library pre-training training
