March 11, 2024, 4:41 a.m. | Martin Riddell, Ansong Ni, Arman Cohan

cs.LG updates on arXiv.org

arXiv:2403.04811v1 Announce Type: cross
Abstract: While large language models have achieved remarkable performance on various code generation benchmarks, there have been growing concerns regarding potential contamination of these benchmarks, as they may be leaked into pretraining and finetuning data. Although recent work has investigated contamination in natural language generation and understanding tasks, there has been less extensive research into how data contamination impacts the evaluation of code generation, which is critical for understanding the robustness and reliability of LLMs in …

