Evaluation of the Programming Skills of Large Language Models | allainews.com

May 24, 2024, 4:55 a.m. | Luc Bryan Heitz, Joun Chamas, Christopher Scherb

cs.CL updates on arXiv.org

arXiv:2405.14388v1 Announce Type: cross
Abstract: The advent of Large Language Models (LLMs) has revolutionized the efficiency and speed with which tasks are completed, marking a significant leap in productivity through technological innovation. As these chatbots tackle increasingly complex tasks, the challenge of assessing the quality of their outputs has become paramount. This paper critically examines the output quality of two leading LLMs, OpenAI's ChatGPT and Google's Gemini AI, by comparing the quality of programming code generated in both their free …
