Andrej Karpathy's Llama 3 review
April 18, 2024, 8:50 p.m. | Simon Willison's Weblog (simonwillison.net)
The most interesting coverage I've seen so far of Meta's Llama 3 models (8b and 70b released, 400b promised later).
Andrej notes that Llama 3 was trained on 15 trillion tokens - up from 2 trillion for Llama 2 - and that Meta used that many even for the smaller 8b model, 75x more than the Chinchilla scaling laws would suggest.
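The 75x figure is easy to sanity-check with a little arithmetic. The sketch below works backwards from the numbers quoted above (15 trillion tokens, an 8b model, a 75x multiple) to the compute-optimal token budget they imply. For comparison it also uses the commonly cited rule of thumb of roughly 20 training tokens per parameter; that ratio is my assumption, not something stated in the post, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the "75x beyond Chinchilla" claim.
# Figures quoted in the post:
LLAMA3_TRAINING_TOKENS = 15e12  # 15 trillion tokens
PARAMS_8B = 8e9                 # the smaller 8b model
QUOTED_MULTIPLE = 75            # "75x more than ... would suggest"

# Assumption (not from the post): the popular approximation of the
# Chinchilla result, about 20 training tokens per model parameter.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20


def implied_optimal_tokens(training_tokens: float, multiple: float) -> float:
    """Token budget that the quoted multiple treats as compute-optimal."""
    return training_tokens / multiple


def overtraining_multiple(
    training_tokens: float, params: float, tokens_per_param: float
) -> float:
    """How many times past the tokens-per-parameter budget the model was trained."""
    return training_tokens / (params * tokens_per_param)


baseline = implied_optimal_tokens(LLAMA3_TRAINING_TOKENS, QUOTED_MULTIPLE)
implied_ratio = baseline / PARAMS_8B
rule_of_thumb_multiple = overtraining_multiple(
    LLAMA3_TRAINING_TOKENS, PARAMS_8B, RULE_OF_THUMB_TOKENS_PER_PARAM
)

print(f"Implied optimal budget: {baseline / 1e9:.0f}B tokens")      # 200B tokens
print(f"Implied tokens per parameter: {implied_ratio:.0f}")         # 25
print(f"Multiple at 20 tokens/param: {rule_of_thumb_multiple:.2f}x")  # 93.75x
```

So the quoted 75x corresponds to a baseline of about 200 billion tokens for the 8b model, while the rougher 20-tokens-per-parameter approximation would put the multiple closer to 94x. Either way the 8b model was trained far past the compute-optimal point.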
The tokenizer has also changed - its vocabulary is now 128,000 tokens, up from 32,000. This …