[R] GPQA: A Graduate-Level Google-Proof Q&A Benchmark
Nov. 22, 2023, 11:58 a.m. | /u/APaperADay
Machine Learning www.reddit.com
**Code and data**: [https://github.com/idavidrein/gpqa/](https://github.com/idavidrein/gpqa/)
**Abstract**:
>We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert validators only reach 34% accuracy, despite spending on average over 30 minutes with unrestricted access to …
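For readers unfamiliar with how figures such as the 65% and 34% above are derived, the sketch below shows plain multiple-choice accuracy: the share of questions where the chosen option matches the answer key. It is only an illustration. The `accuracy` helper and the toy answers are hypothetical and are not taken from the GPQA repository or dataset, and the sketch does not attempt to reproduce the paper's adjustment for mistakes identified in retrospect.

```python
def accuracy(responses):
    """Return the fraction of (chosen, correct) pairs that match."""
    if not responses:
        raise ValueError("need at least one response to score")
    hits = sum(1 for chosen, correct in responses if chosen == correct)
    return hits / len(responses)


# Toy data (hypothetical, NOT from GPQA): (option picked, answer key).
toy_responses = [("A", "A"), ("B", "C"), ("D", "D"), ("C", "C")]
print(f"accuracy = {accuracy(toy_responses):.0%}")
```

Scoring the real benchmark would mean replacing `toy_responses` with answers paired against the dataset's own answer key, whose format should be checked in the linked repository.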