April 16, 2024, 8:46 p.m. | /u/dancleary544

AI Prompt Programming | www.reddit.com

Recently stumbled upon a [paper from Durham University](https://arxiv.org/pdf/2403.16977.pdf) that pitted physics students against GPT-3.5 and GPT-4 in a university-level coding assignment.

I really liked the study because, unlike benchmarks, which can be fuzzy or misleading, this was a well-controlled case study of humans vs. AI on a specific task.

At a high level, here were the main takeaways:

- Students outperformed the AI models, scoring 91.9% compared to 81.1% for the best-performing AI method (GPT-4 with prompt engineering).
- …
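The paper's strongest AI result came from GPT-4 combined with prompt engineering. The study's actual prompts aren't reproduced here, so the template and task below are purely illustrative: a minimal sketch of how a structured prompt for a physics coding exercise might be assembled before sending it to a model.

```python
# Hypothetical sketch of prompt engineering for a physics coding task.
# The template and example assignment are NOT from the Durham study.

def build_prompt(assignment: str) -> str:
    """Assemble a structured prompt: role, constraints, then the task."""
    return (
        "You are an expert physics programmer. "
        "Write clear, commented Python that solves the task below, "
        "and explain your reasoning in comments before each step.\n\n"
        f"Task:\n{assignment}"
    )

assignment = "Numerically integrate dx/dt = -x from t=0 to t=5 with x(0)=1."
prompt = build_prompt(assignment)

# The assembled prompt would then be sent to the model, e.g. with the
# OpenAI Python SDK (>= 1.0), roughly:
#
#   from openai import OpenAI
#   client = OpenAI()
#   reply = client.chat.completions.create(
#       model="gpt-4",
#       messages=[{"role": "user", "content": prompt}],
#   )
```

The point of templates like this is that small framing choices (assigning a role, demanding comments, separating the task text) were apparently enough to move GPT-4's score in the study, without changing the underlying model.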

