Andrej Karpathy's Llama 3 review
April 18, 2024, 8:50 p.m. | Simon Willison's Weblog (simonwillison.net)
The most interesting coverage I've seen so far of Meta's Llama 3 models (8B and 70B released, 400B promised later).
Andrej notes that Llama 3 was trained on 15 trillion tokens, up from 2 trillion for Llama 2, and that Meta used that many even for the smaller 8B model: 75x more than the Chinchilla scaling laws would suggest.
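To make the 75x figure concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above, plus one labeled assumption: the commonly cited Chinchilla rule of thumb of about 20 training tokens per parameter, which is not stated in the post. The variable and function names are illustrative only.

```python
# Figures quoted in the post.
LLAMA3_TOKENS = 15_000_000_000_000   # 15 trillion training tokens
LLAMA2_TOKENS = 2_000_000_000_000    # 2 trillion training tokens
PARAMS_8B = 8_000_000_000            # parameters in the smaller Llama 3 model
QUOTED_MULTIPLE = 75                 # "75x more than Chinchilla would suggest"

# Assumption (not from the post): the popular Chinchilla rule of thumb
# of roughly 20 training tokens per model parameter.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20


def tokens_per_param(tokens: int, params: int) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


# Baseline token count implied by the quoted 75x multiple.
implied_optimal_tokens = LLAMA3_TOKENS / QUOTED_MULTIPLE
implied_ratio = tokens_per_param(implied_optimal_tokens, PARAMS_8B)

# What the 8B model actually saw per parameter.
actual_ratio = tokens_per_param(LLAMA3_TOKENS, PARAMS_8B)

# Growth in training data from Llama 2 to Llama 3.
data_growth = LLAMA3_TOKENS / LLAMA2_TOKENS

# The same multiple under the assumed 20 tokens/param rule of thumb.
rule_of_thumb_multiple = LLAMA3_TOKENS / (
    RULE_OF_THUMB_TOKENS_PER_PARAM * PARAMS_8B
)

print(f"Implied compute-optimal tokens: {implied_optimal_tokens:.3g}")
print(f"Implied tokens per parameter:   {implied_ratio:.1f}")
print(f"Actual tokens per parameter:    {actual_ratio:.1f}")
print(f"Llama 2 -> Llama 3 data growth: {data_growth:.1f}x")
print(f"Multiple under 20 tokens/param: {rule_of_thumb_multiple:.2f}x")
```

Read this way, the quoted 75x corresponds to a compute-optimal budget of about 200 billion tokens for an 8B model (roughly 25 tokens per parameter). Under the assumed 20 tokens per parameter rule of thumb the multiple comes out somewhat higher, so the exact figure depends on which Chinchilla estimate is used.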
The tokenizer has also changed: its vocabulary is now 128,000 tokens, up from 32,000. This …