[D] StrategyQA may contain far more errors than we previously thought
Jan. 1, 2024, 10:12 a.m. | /u/Radiant_Routine_3183
Machine Learning www.reddit.com
Over the New Year holiday, inspired by [this paper](https://www.semanticscholar.org/paper/Did-Aristotle-Use-a-Laptop-A-Question-Answering-Geva-Khashabi/346081161bdc8f18e2a4c4af7f51d35452b5cb01), I tried evaluating the OpenAI models on several datasets, including StrategyQA.
In short, this dataset contains many questions that require multi-step reasoning and common sense. Here's an example:
{
"qid": "e1f10b57579fa6a92aa9",
"term": "Martin Luther",
"description": "Saxon priest, monk and theologian, seminal figure in Protestant Reformation",
"question": "Did Martin Luther believe in Satan?",
"answer": true,
"facts": [
"Martin Luther was a Protestant.",
"Satan is also known as the devil.", …
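For anyone who wants to try a similar check, here is a minimal sketch of scoring yes/no model outputs against records with the layout shown above. It is not the evaluation script used for the post. The record is trimmed to the fields visible in the excerpt (the `facts` list is truncated in the source, so it is left out), and `normalize_prediction`, `accuracy`, and the stand-in model output are hypothetical names and data made up for illustration.

```python
import json

# One record, trimmed to fields visible in the post excerpt.
# The "facts" list is truncated in the source, so it is omitted here.
RAW = """
[
  {
    "qid": "e1f10b57579fa6a92aa9",
    "term": "Martin Luther",
    "question": "Did Martin Luther believe in Satan?",
    "answer": true
  }
]
"""


def normalize_prediction(text):
    """Map free-form model output to True/False, or None if unclear.

    Hypothetical helper: it only inspects the first word of the output.
    """
    words = text.strip().lower().split()
    if not words:
        return None
    token = words[0].strip(".,!:;")
    if token in ("yes", "true"):
        return True
    if token in ("no", "false"):
        return False
    return None


def accuracy(records, predictions):
    """Fraction of records whose normalized prediction equals the gold answer.

    predictions maps qid -> raw model output; missing or unclear
    outputs count as wrong.
    """
    if not records:
        return 0.0
    correct = 0
    for record in records:
        pred = normalize_prediction(predictions.get(record["qid"], ""))
        if pred is not None and pred == record["answer"]:
            correct += 1
    return correct / len(records)


records = json.loads(RAW)
# Stand-in for a real model response (assumption, not actual model output).
predictions = {"e1f10b57579fa6a92aa9": "Yes, he did."}
score = accuracy(records, predictions)
print(f"accuracy: {score:.2f}")
```

Keeping the disagreements between a model and the gold `answer` field, rather than only the aggregate score, makes it easier to go back and inspect individual questions by hand.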