all AI news
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
Feb. 21, 2024, 5:48 a.m. | Nishant Balepur, Abhilasha Ravichander, Rachel Rudinger
cs.CL updates on arXiv.org arxiv.org
Abstract: Multiple-choice question answering (MCQA) is often used to evaluate large language models (LLMs). To see if MCQA assesses LLMs as intended, we probe if LLMs can perform MCQA with choices-only prompts, where models must select the correct answer only from the choices. In three MCQA datasets and four LLMs, this prompt bests a majority baseline in 11/12 cases, with up to 0.33 accuracy gain. To help explain this behavior, we conduct an in-depth, black-box analysis …
abduction abstract arxiv cs.cl language language models large language large language models llms multiple probe prompts question question answering questions type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne