March 7, 2024, 5:48 a.m. | Indraneil Paul, Jun Luo, Goran Glavaš, Iryna Gurevych

cs.CL updates on arXiv.org

arXiv:2403.03894v1 Announce Type: cross
Abstract: Code understanding and generation have fast become some of the most popular applications of language models (LMs). Nonetheless, research on multilingual aspects of Code-LMs (i.e., LMs for code generation) such as cross-lingual transfer between different programming languages, language-specific data augmentation, and post-hoc LM adaptation, alongside exploitation of data sources other than the original textual content, has been much sparser than for their natural language counterparts. In particular, most mainstream Code-LMs have been pre-trained on source …
