March 30, 2024, 5:36 p.m. | /u/hapliniste

Machine Learning www.reddit.com

Hi, I want to test some ideas about training very small (<1B parameter) MoE models, just to see whether it gives convincing results.

Is there a GitHub repo that's very basic but implements what's required for a real training run and would let me tinker with MoE training? Something like [NanoGPT-MoE](https://github.com/Antlera/nanoGPT-moe) (which I tried), but a bit more complete, as I don't think it has anything to force expert utilization, for example (see the sketch below for the kind of thing I mean).
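For context, by "forcing expert utilization" I mean something like a Switch-Transformer-style auxiliary load-balancing loss added to the LM loss. A minimal PyTorch sketch (assuming a top-1 router that outputs raw per-token logits; names are just illustrative):

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int) -> torch.Tensor:
    """Auxiliary loss that pushes the router toward uniform expert usage.

    router_logits: (num_tokens, num_experts) raw scores from the router.
    """
    probs = F.softmax(router_logits, dim=-1)              # routing probabilities
    top1 = probs.argmax(dim=-1)                           # top-1 expert per token
    # f_i: fraction of tokens actually dispatched to expert i
    dispatch_frac = F.one_hot(top1, num_experts).float().mean(dim=0)
    # P_i: mean routing probability mass assigned to expert i
    prob_frac = probs.mean(dim=0)
    # Minimized when both distributions are uniform (1 / num_experts each)
    return num_experts * torch.sum(dispatch_frac * prob_frac)
```

This would get scaled by a small coefficient (e.g., 0.01) and added to the cross-entropy loss each step, which is roughly what NanoGPT-MoE seems to be missing.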

Is there a go-to repository for that or …

