Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation

March 6, 2024, 5:48 a.m. | Bin Zhang, Yuxiao Ye, Guoqing Du, Xiaoru Hu, Zhishuai Li, Sun Yang, Chi Harold Liu, Rui Zhao, Ziyue Li, Hangyu Mao

cs.CL updates on arXiv.org arxiv.org

arXiv:2403.02951v1 Announce Type: new
Abstract: Large Language Models (LLMs) have emerged as a powerful tool in advancing the Text-to-SQL task, significantly outperforming traditional methods. Nevertheless, as a nascent research field, there is still no consensus on the optimal prompt templates and design frameworks. Additionally, existing benchmarks inadequately explore the performance of LLMs across the various sub-tasks of the Text-to-SQL process, which hinders the assessment of LLMs' cognitive capabilities and the optimization of LLM-based solutions. To address the aforementioned issues, we firstly …
