The GPT-4 barrier has finally been smashed

March 8, 2024, 6:02 p.m. | Simon Willison's Weblog (simonwillison.net)

Four weeks ago, GPT-4 remained the undisputed champion: consistently at the top of every key benchmark, but more importantly the clear winner in terms of "vibes". Almost everyone investing serious time exploring LLMs agreed that it was the most capable default model for the majority of tasks - and had been for more than a year.

Today that barrier has finally been smashed. We have four new models, all released to the public in the last four weeks, that are …
