480B LLM as 128x4B MoE? WHY?
April 26, 2024 | code_your_own_AI (www.youtube.com)
A short introduction to MoE, followed by a comparison of different model architectures and a causal-reasoning test (using the test suite published by Stanford University).
Can a relatively small LLM, with fewer than e.g. 5 billion trainable parameters, solve complex reasoning tasks? We evaluated this in my last video on PHI-3 MINI. …
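The appeal of a many-expert MoE is that only a few experts run per token, so the active compute per token is a small fraction of the total parameter count. The toy NumPy sketch below shows top-k expert routing; all dimensions, weight shapes, and the single-linear-layer "experts" are illustrative assumptions, not the configuration of any real 480B model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes -- not a real model's configuration.
d_model, n_experts, top_k = 8, 128, 2

# Router weights plus one tiny linear "expert" per slot.
W_router = rng.normal(size=(d_model, n_experts))
W_experts = rng.normal(size=(n_experts, d_model, d_model))

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ W_router                      # (n_experts,) routing scores
    idx = np.argsort(logits)[-top_k:]          # indices of the top-k experts
    gates = np.exp(logits[idx] - logits[idx].max())
    gates /= gates.sum()                       # softmax over the selected experts only
    # Only top_k of the n_experts run: per-token compute scales with
    # top_k/n_experts of the total expert parameters.
    return sum(g * (x @ W_experts[i]) for g, i in zip(gates, idx))

y = moe_forward(rng.normal(size=d_model))
```

With top_k=2 of 128 experts, each token touches roughly 1/64 of the expert parameters, which is the basic trade the video's title is asking about.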
More from www.youtube.com / code_your_own_AI
New xLSTM explained: Better than Transformer LLMs? | 1 day, 19 hours ago
Stealth LLM: im-a-good-gpt2-chatbot | 3 days, 19 hours ago
Understand DSPy: Programming AI Pipelines | 5 days, 19 hours ago
New Discovery: Retrieval Heads for Long Context | 1 week, 2 days ago
Multi-Token Prediction (forget next token LLM?) | 1 week, 3 days ago
NEW LLM Test: Reasoning & gpt2-chatbot | 1 week, 5 days ago
LLMs: Rewriting Our Tomorrow (plus code) #ai | 1 week, 6 days ago