April 6, 2024, 9:35 p.m. | /u/CriticalTemperature1

Machine Learning www.reddit.com

I've been looking into MoE models to understand why they work so well, and I was wondering whether there are ways to create effectively "infinite" experts, i.e. a router that can select however many parameters the task requires. As an example, Mixtral uses 8 experts, but the architecture can scale to more: [https://arxiv.org/abs/2401.04088](https://arxiv.org/abs/2401.04088)

The idea is to have some logic that can choose which weights to multiply ahead of time, but I have a feeling that the computation required to …
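For context, here is a minimal sketch of the kind of routing logic a Mixtral-style MoE layer uses: a small linear gate scores every expert per token, and only the top-k experts are actually run. The class name `TopKRouter` and the shapes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Mixtral-style gate: score every expert, keep only the top-k per token."""
    def __init__(self, hidden_dim: int, num_experts: int, k: int = 2):
        super().__init__()
        # One logit per expert; this layer's cost grows linearly with num_experts.
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (tokens, hidden_dim)
        logits = self.gate(x)                            # (tokens, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)           # renormalize over the k picked
        return weights, topk_idx                         # only these k experts run per token

# Illustrative usage with Mixtral-8x7B-like sizes (4096 hidden dim, 8 experts, top-2):
router = TopKRouter(hidden_dim=4096, num_experts=8, k=2)
tokens = torch.randn(16, 4096)
weights, idx = router(tokens)  # weights: (16, 2), idx: (16, 2)
```

Note that the gate itself is a dense layer over all experts, so "infinite" experts would make the routing step itself the bottleneck; scaling the expert count means the selection logic has to get cheaper than a full scoring pass.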

