Shifting order in multiple-choice questions massively affects LLM performance | allainews.com

Aug. 30, 2023, 7:59 p.m. | /u/AIsupercharged

Artificial Intelligence www.reddit.com

Recent research suggests that Large Language Models (LLMs) may be less reliable than commonly assumed: the order of the options in a multiple-choice question drastically influences the responses of LLMs such as GPT-4 and InstructGPT.



**What are the findings?**

* **LLM sensitivity to multiple-choice arrangement:** The study suggests that if the options in a multiple-choice question are reordered, the LLM's …
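One way to make the idea of order sensitivity concrete is to ask the same question under every ordering of its options and count how often the chosen answer changes. The sketch below is only an illustration of that idea, not the study's protocol: `order_sensitivity`, `first_option_model`, and `content_model` are hypothetical names, and the two toy "models" are stand-ins for a real LLM call, which you would have to supply yourself.

```python
from itertools import permutations
from typing import Callable, List, Sequence

# Hypothetical interface: a "model" takes a question and an ordered list of
# options, and returns the TEXT of the option it picks (not the letter).
Model = Callable[[str, List[str]], str]


def order_sensitivity(question: str, options: Sequence[str], ask_model: Model) -> float:
    """Fraction of option orderings where the answer differs from the baseline.

    The baseline is the answer given for the original ordering. A value of
    0.0 means the answer never changed; higher values mean more sensitivity.
    """
    baseline = ask_model(question, list(options))
    orderings = list(permutations(options))
    changed = sum(
        1 for ordering in orderings
        if ask_model(question, list(ordering)) != baseline
    )
    return changed / len(orderings)


def first_option_model(question: str, options: List[str]) -> str:
    # Toy stand-in with pure positional bias: always picks the first option.
    return options[0]


def content_model(question: str, options: List[str]) -> str:
    # Toy stand-in that picks the same answer text wherever it appears.
    return "Paris" if "Paris" in options else options[0]


if __name__ == "__main__":
    q = "What is the capital of France?"
    opts = ["Paris", "London", "Rome", "Berlin"]
    print(order_sensitivity(q, opts, first_option_model))  # 0.75
    print(order_sensitivity(q, opts, content_model))       # 0.0
```

With four options there are 24 orderings; the position-biased toy model changes its answer in the 18 orderings where "Paris" is not listed first, while the content-based one never changes. For a real LLM you would replace the toy functions with a call to the model and probably sample a subset of orderings, since the number of permutations grows factorially.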


