StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Feb. 5, 2024, 6:48 a.m. | Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Junjie Shan, Caishuang Huang, Wei Shen

cs.CL updates on arXiv.org

The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback to explore the output space of LLMs and improve code generation quality. However, the lengthy code that LLMs generate in response to complex human requirements makes RL exploration challenging. Moreover, because unit tests may not cover complicated code, optimizing LLMs on these unexecuted code snippets is ineffective. To tackle these challenges, we …
