April 6, 2024, 9:35 p.m. | /u/CriticalTemperature1

Machine Learning www.reddit.com

I've been looking into MoE models to understand why they work so well, and I was wondering whether there are ways to create effectively "infinite" experts, i.e. a router that can select however many parameters the task requires. As an example, Mixtral uses 8 experts, but the architecture can scale to more: [https://arxiv.org/abs/2401.04088](https://arxiv.org/abs/2401.04088)

The idea is to have some logic that can choose which weights to multiply ahead of time, but I have a feeling that the computation required to …
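For context, here is a minimal sketch of the kind of routing logic a Mixtral-style MoE layer uses: a small linear gate scores every expert per token, and only the top-k experts are actually run. The class name `TopKRouter` and the shapes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Mixtral-style gate: score every expert, keep only the top-k per token."""
    def __init__(self, hidden_dim: int, num_experts: int, k: int = 2):
        super().__init__()
        # One logit per expert; this layer's cost grows linearly with num_experts.
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (tokens, hidden_dim)
        logits = self.gate(x)                            # (tokens, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)           # renormalize over the k picked
        return weights, topk_idx                         # only these k experts run per token

# Illustrative usage with Mixtral-8x7B-like sizes (4096 hidden dim, 8 experts, top-2):
router = TopKRouter(hidden_dim=4096, num_experts=8, k=2)
tokens = torch.randn(16, 4096)
weights, idx = router(tokens)  # weights: (16, 2), idx: (16, 2)
```

Note that the gate itself is a dense layer over all experts, so "infinite" experts would make the routing step itself the bottleneck; scaling the expert count means the selection logic has to get cheaper than a full scoring pass.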

