all AI news
Topic: moe
Items published with this topic over the last 90 days.
Latest
[D] Are there any MoE models other than LLMs?
2 days, 1 hour ago | www.reddit.com
Routers in Vision Mixture of Experts: An Empirical Study
3 days, 6 hours ago | arxiv.org
MIXTRAL 8x22B: The BEST MoE Just got Better | RAG and Function Calling
3 days, 23 hours ago | www.youtube.com
Era of Hyper-Real AI Videos is here 🤯
6 days, 22 hours ago | unwindai.substack.com
[D] How does a MoE router learn when it has made a wrong choice?
2 weeks, 4 days ago | www.reddit.com
Jamba: A Hybrid Transformer-Mamba Language Model
3 weeks, 3 days ago | arxiv.org
[D] What's your go-to simple MoE training code project?
3 weeks, 4 days ago | www.reddit.com
JAMBA MoE: Open Source MAMBA w/ Transformer: CODE
3 weeks, 5 days ago | www.youtube.com
[D] I don't understand how backprop works on sparsely gated MoE
1 month, 1 week ago | www.reddit.com
Applying Mixture of Experts in LLM Architectures
1 month, 1 week ago | developer.nvidia.com
Octavius: Mitigating Task Interference in MLLMs via MoE
1 month, 1 week ago | arxiv.org
Vanilla Transformers are Transfer Capability Teachers
1 month, 2 weeks ago | arxiv.org
SADMoE: Exploiting Activation Sparsity with Dynamic-k Gating
1 month, 4 weeks ago | arxiv.org
Topic trend (last 90 days)
Top (last 7 days)
MIXTRAL 8x22B: The BEST MoE Just got Better | RAG and Function Calling
3 days, 23 hours ago | www.youtube.com
Era of Hyper-Real AI Videos is here 🤯
6 days, 22 hours ago | unwindai.substack.com