Mixtral of Experts
Jan. 9, 2024, 4:03 a.m. | Simon Willison's Weblog (simonwillison.net)
The Mixtral paper is out, exactly a month after the release of the Mixtral 8x7B model itself. Thanks to the paper I now have a reasonable understanding of how a mixture-of-experts model works: each layer has 8 available blocks, but a router model selects two out of those eight for each token passing through that layer and combines their output. "As a result, each token has access to 47B parameters, but only uses 13B active …
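The arithmetic behind that quote: only 2 of the 8 expert feed-forward blocks run for any given token, so per-token compute is close to that of a ~13B dense model even though the full set of weights is ~47B. To make the routing concrete, here's a minimal sketch of top-2 routing in NumPy. The dimensions, the random stand-in weights and the moe_layer function are all hypothetical illustrations, not Mixtral's actual implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy dimensions for illustration only; Mixtral uses 8 experts per
# layer and routes each token to the top 2 of them.
NUM_EXPERTS, HIDDEN = 8, 16
rng = np.random.default_rng(0)

# Stand-ins for trained components: a router weight matrix plus one
# feed-forward "expert" per slot (random linear maps here).
router_w = rng.normal(size=(HIDDEN, NUM_EXPERTS))
experts = [rng.normal(size=(HIDDEN, HIDDEN)) for _ in range(NUM_EXPERTS)]

def moe_layer(token, top_k=2):
    """Route one token through its top-k experts and mix the outputs."""
    logits = token @ router_w                 # one routing score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the k best-scoring experts
    weights = softmax(logits[top])            # renormalise over just those k
    # Only the selected experts run, so only their parameters are "active".
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=HIDDEN)
print(moe_layer(token).shape)  # (16,)
```

Because only the two selected expert matrices participate in the forward pass, each token touches roughly a quarter of the expert parameters (plus the shared attention weights), which is how a 47B-parameter model ends up with about 13B active parameters per token.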