[R] Mixtures of Experts Unlock Parameter Scaling for Deep RL | allainews.com

Feb. 16, 2024, 3:02 a.m. | /u/OwnAd9305

Machine Learning www.reddit.com

Abstract:
> The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model’s performance scales proportionally to its size. Analogous scaling laws remain elusive for reinforcement learning domains, however, where increasing the parameter count of a model often hurts its final performance. In this paper, we demonstrate that incorporating Mixture-of-Expert (MoE) modules, and in particular Soft MoEs (Puigcerver et al., 2023), into value-based networks results in more parameter-scalable models, evidenced by …

abstract count deep rl domains experts laws machinelearning part performance progress reinforcement reinforcement learning scaling s performance supervised learning

More from www.reddit.com / Machine Learning

[D] What are the most common and significant challenges moving your LLM (application/system) to production? 2 hours ago | www.reddit.com

application building challenges companies +10

[P] NLLB-200 Distill 350M for en-ko 9 hours ago | www.reddit.com

cpu english good gpu +9

[D] Real talk about RAG 16 hours ago | www.reddit.com

data deal documents machinelearning +5

[P] Classification finetuning experiments on small GPT-2 sized LLMs 22 hours ago | www.reddit.com

acc classification context cpu +16

[D] Llama-3 based OpenBioLLM-70B & 8B: Outperforms GPT-4, Gemini, Meditron-70B, Med-PaLM-1 & Med-PaLM-2 in Medical-domain 22 hours ago | www.reddit.com

70b art biomedical domain +16

How do I convince my superior to do data preprocessing? [D] 22 hours ago | www.reddit.com

ai engineer build chat chatbots +11

[D] Llama-3 based OpenBioLLM-70B & 8B: Outperforms GPT-4, Gemini, Meditron-70B, Med-PaLM-1 & Med-PaLM-2 in Medical-domain 23 hours ago | www.reddit.com

70b art biomedical domain +16

[D] Mathematical aspects of tokenization 1 day, 1 hour ago | www.reddit.com

compression educational encoding entropy +7

[R] Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey 1 day, 2 hours ago | www.reddit.com

abstract advancement application challenges +15

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

Software Engineering Manager, Generative AI - Characters

@ Meta | Bellevue, WA | Menlo Park, CA | Seattle, WA | New York City | San Francisco, CA

View on ai-jobs.net

Senior Operations Research Analyst / Predictive Modeler

@ LinQuest | Colorado Springs, Colorado, United States

View on ai-jobs.net