Evaluation of the Programming Skills of Large Language Models | allainews.com

May 24, 2024, 4:55 a.m. | Luc Bryan Heitz, Joun Chamas, Christopher Scherb

cs.CL updates on arXiv.org

arXiv:2405.14388v1 Announce Type: cross
Abstract: The advent of Large Language Models (LLMs) has revolutionized the efficiency and speed with which tasks are completed, marking a significant leap in productivity through technological innovation. As these chatbots tackle increasingly complex tasks, the challenge of assessing the quality of their outputs has become paramount. This paper critically examines the output quality of two leading LLMs, OpenAI's ChatGPT and Google's Gemini AI, by comparing the quality of programming code generated in both their free …


