April 16, 2024, 8:46 p.m. | /u/dancleary544

AI Prompt Programming | www.reddit.com

Recently stumbled upon a [paper from Durham University](https://arxiv.org/pdf/2403.16977.pdf) that pitted physics students against GPT-3.5 and GPT-4 in a university-level coding assignment.

I really liked the study because, unlike benchmarks, which can be fuzzy or misleading, this was a well-controlled case study of humans vs. AI on a specific task.

At a high level, here were the main takeaways:

- Students outperformed the AI models, scoring 91.9% compared to 81.1% for the best-performing AI method (GPT-4 with prompt engineering).
- …
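The paper's strongest AI result came from GPT-4 combined with prompt engineering. The study's actual prompts aren't reproduced here, so the template and task below are purely illustrative: a minimal sketch of how a structured prompt for a physics coding exercise might be assembled before sending it to a model.

```python
# Hypothetical sketch of prompt engineering for a physics coding task.
# The template and example assignment are NOT from the Durham study.

def build_prompt(assignment: str) -> str:
    """Assemble a structured prompt: role, constraints, then the task."""
    return (
        "You are an expert physics programmer. "
        "Write clear, commented Python that solves the task below, "
        "and explain your reasoning in comments before each step.\n\n"
        f"Task:\n{assignment}"
    )

assignment = "Numerically integrate dx/dt = -x from t=0 to t=5 with x(0)=1."
prompt = build_prompt(assignment)

# The assembled prompt would then be sent to the model, e.g. with the
# OpenAI Python SDK (>= 1.0), roughly:
#
#   from openai import OpenAI
#   client = OpenAI()
#   reply = client.chat.completions.create(
#       model="gpt-4",
#       messages=[{"role": "user", "content": prompt}],
#   )
```

The point of templates like this is that small framing choices (assigning a role, demanding comments, separating the task text) were apparently enough to move GPT-4's score in the study, without changing the underlying model.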

