[R] LoRA-MoE: Training and running inference on MoE models like Mixtral 8x7B as if they were 7B-parameter models
Feb. 10, 2024, 3:39 p.m. | /u/ashz8888
Machine Learning www.reddit.com
Since only two of the eight expert groups need to be loaded into memory for each token, while the others remain offloaded, the model only uses about 12.9B of its 46.7B total parameters at any point.
I'm wondering whether the parameters could be brought down to almost the same level …
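As a rough sanity check on those figures, here is a back-of-the-envelope sketch of total versus per-token active parameters, assuming the publicly documented Mixtral 8x7B dimensions (hidden size 4096, expert FFN size 14336, 32 layers, 8 experts with 2 routed per token, 32k vocabulary, grouped-query attention). Small terms such as the router and layer norms are ignored, so the constants and the resulting totals are approximations, not figures from the post itself.

```python
# Approximate parameter count for a Mixtral-8x7B-style MoE model.
# Configuration values below are the publicly documented Mixtral 8x7B dimensions;
# router and normalization weights are ignored, so results are approximate.

HIDDEN = 4096            # model (embedding) dimension
FFN = 14336              # expert feed-forward dimension
LAYERS = 32              # transformer layers
EXPERTS = 8              # experts per layer
ACTIVE_EXPERTS = 2       # experts routed per token
VOCAB = 32000            # vocabulary size
HEADS, KV_HEADS = 32, 8  # grouped-query attention

head_dim = HIDDEN // HEADS
# Attention: Q and O projections are HIDDEN x HIDDEN; K and V are HIDDEN x (KV_HEADS * head_dim).
attn_per_layer = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_HEADS * head_dim
# One SwiGLU expert: gate, up, and down projections.
expert_params = 3 * HIDDEN * FFN
# Input embedding table plus output head.
embeddings = 2 * VOCAB * HIDDEN

total = LAYERS * (attn_per_layer + EXPERTS * expert_params) + embeddings
active = LAYERS * (attn_per_layer + ACTIVE_EXPERTS * expert_params) + embeddings

print(f"total  ~{total / 1e9:.1f}B parameters")    # ~46.7B
print(f"active ~{active / 1e9:.1f}B per token")    # ~12.9B
```

With these assumptions the script reproduces the numbers quoted above: roughly 46.7B parameters in total, of which about 12.9B are touched per token because only two experts' FFN weights participate in each forward pass.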
experts inferencing lora machinelearning mixtral mixtral 8x7b moe network processing token training