June 16, 2022, 1:11 a.m. | Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray

cs.LG updates on arXiv.org

Pre-trained generative language models for source code (e.g., PLBART, CodeT5, SPT-Code) have yielded strong results on several tasks over the past few years, including code generation and translation. These models adopt varying pre-training objectives to learn the statistics of code construction from very large-scale corpora in a self-supervised fashion; the success of pre-trained models largely hinges on these pre-training objectives. This paper proposes a new pre-training objective, "Naturalizing" of source code, exploiting code's bimodal, dual-channel (formal & natural channels) nature. Unlike …

arxiv code pl pre-training training
