Andrej Karpathy's Llama 3 review

April 18, 2024, 8:50 p.m.

Simon Willison's Weblog simonwillison.net

The most interesting coverage I've seen so far of Meta's Llama 3 models (8b and 70b so far, 400b promised later).

Andrej notes that Llama 3 was trained on 15 trillion tokens - up from 2 trillion for Llama 2 - and that Meta used that many even for the smaller 8b model, 75x more than the Chinchilla scaling laws would suggest.
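To make the 75x figure concrete, here is a rough back-of-the-envelope sketch. It assumes a compute-optimal budget of about 25 training tokens per parameter, a round number chosen here because it reproduces the quoted ratio; it is not stated in the post. The more commonly cited rule of thumb of about 20 tokens per parameter would give a ratio closer to 94x. The function name and parameters are illustrative only.

```python
def overtraining_ratio(tokens_trained: int, n_params: int, tokens_per_param: float) -> float:
    """How many times the training set exceeds a Chinchilla-style token budget.

    tokens_per_param is an assumed rule-of-thumb constant, not a value from the post.
    """
    optimal_tokens = n_params * tokens_per_param
    return tokens_trained / optimal_tokens


LLAMA_3_TOKENS = 15_000_000_000_000  # 15 trillion, from the post
LLAMA_2_TOKENS = 2_000_000_000_000   # 2 trillion, from the post
PARAMS_8B = 8_000_000_000            # the smaller Llama 3 model

if __name__ == "__main__":
    # Growth in training data between Llama 2 and Llama 3.
    print(f"data growth: {LLAMA_3_TOKENS / LLAMA_2_TOKENS:.1f}x")

    # Assumed 25 tokens/parameter, i.e. a ~200B-token budget for an 8b model.
    print(f"vs. 25 tokens/param: {overtraining_ratio(LLAMA_3_TOKENS, PARAMS_8B, 25):.2f}x")

    # Assumed 20 tokens/parameter, the more commonly quoted approximation.
    print(f"vs. 20 tokens/param: {overtraining_ratio(LLAMA_3_TOKENS, PARAMS_8B, 20):.2f}x")
```

Either way, the 8b model saw far more data than a compute-optimal budget would call for; the exact multiple depends on which approximation of the scaling law is used.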

The tokenizer has also changed - it now uses a vocabulary of 128,000 tokens, up from 32,000. This …