Mixture-of-Experts and Trends in Large-Scale Language Modeling with Irwan Bello - #569

April 25, 2022, 4:55 p.m. | Sam Charrington

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) twimlai.com

Today we’re joined by Irwan Bello, formerly a research scientist at Google Brain and now on the founding team at a stealth AI startup. We begin our conversation with an exploration of Irwan’s recent paper, Designing Effective Sparse Expert Models, which acts as a design guide for building sparse large language model architectures. We discuss mixture of experts as a technique, the scalability of this method, its applicability beyond NLP tasks, and the data sets this experiment was benchmarked against. …


