Aug. 30, 2023, 7:59 p.m. | /u/AIsupercharged

Artificial Intelligence www.reddit.com

Recent research suggests that Large Language Models (LLMs) may be less reliable than we think: the order of the options in a multiple-choice question can drastically influence the answers given by LLMs such as GPT-4 and InstructGPT.

https://preview.redd.it/dxfsq72kzalb1.png?width=1289&format=png&auto=webp&s=e4ed5b541073bde18d2865f2c15e8028388070f5

**What are the findings?**

* **LLM sensitivity to multiple-choice arrangement:** The study suggests that if the options in a multiple-choice question are reordered, the LLM's …
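
If you want to poke at this kind of order sensitivity yourself, here is a minimal sketch. It is not the study's actual protocol. It asks the same question under every ordering of the options and counts which option text gets picked each time. `ask_model` is a hypothetical callable you would swap for a real LLM call that returns a single letter, and `always_first` is a made-up stand-in that always answers "A" so the script runs without any API.

```python
from collections import Counter
from itertools import permutations

LETTERS = "ABCDEFGH"


def format_prompt(question, options):
    """Render a question and its options as a lettered multiple-choice prompt."""
    lines = [question]
    for letter, option in zip(LETTERS, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def order_sensitivity(question, options, ask_model):
    """Ask the question under every option ordering.

    Returns a Counter mapping each option's text to how often it was chosen.
    """
    picks = Counter()
    for ordering in permutations(options):
        letter = ask_model(format_prompt(question, ordering))
        # Map the returned letter back to the option text at that position.
        picks[ordering[LETTERS.index(letter)]] += 1
    return picks


def always_first(prompt):
    # Hypothetical stand-in for an LLM: fully position-biased, always says "A".
    return "A"


picks = order_sensitivity("What is 2 + 2?", ["3", "4", "5"], always_first)
# Share of orderings in which the most frequently chosen option was picked.
consistency = max(picks.values()) / sum(picks.values())
print(picks, round(consistency, 2))
```

A model that actually reads the options should pick the same option text in every ordering, so its consistency would be 1.0. The position-biased stand-in spreads its picks evenly across the options, which is the failure mode the post describes.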
