DeepSpeed: Advancing MoE inference and training to power next-generation AI scale
Jan. 19, 2022, 5:19 p.m. | Alyssa Hughes
Microsoft Research www.microsoft.com
In the last three years, the largest trained dense models have grown more than 1,000-fold, from a few hundred million parameters to over 500 billion parameters in Megatron-Turing NLG 530B (MT-NLG). The consistent quality gains that have accompanied this scaling suggest the trend will continue, with larger models delivering better quality. However, […]
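For context on the mixture-of-experts (MoE) approach named in the title, below is a minimal sketch of a top-1 gated MoE layer in PyTorch. It illustrates the core idea that motivates DeepSpeed-MoE as an alternative to dense scaling: total parameters grow with the number of experts, but each token is routed to only one expert, so per-token compute stays close to that of a single dense feed-forward block. All names here (MoELayer, num_experts, and so on) are illustrative assumptions, not DeepSpeed's actual API.

# A minimal top-1 gated mixture-of-experts layer (illustrative sketch,
# not DeepSpeed's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        # One feed-forward "expert" per slot; total parameters scale
        # linearly with num_experts, unlike a single dense block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        )
        # The gate scores each token against each expert.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing.
        tokens = x.reshape(-1, x.shape[-1])
        gate_probs = F.softmax(self.gate(tokens), dim=-1)
        top_prob, top_idx = gate_probs.max(dim=-1)  # top-1 routing
        out = torch.zeros_like(tokens)
        # Each token runs through exactly one expert's compute.
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                out[mask] = expert(tokens[mask]) * top_prob[mask, None]
        return out.reshape(x.shape)

# Example: 8 experts multiply feed-forward parameters roughly 8x,
# while each token still incurs only one expert's worth of compute.
layer = MoELayer(d_model=64, d_hidden=256, num_experts=8)
y = layer(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])

This sketch dispatches tokens with a Python loop for clarity; production systems such as DeepSpeed instead batch the dispatch and spread experts across devices (expert parallelism), which is what makes trillion-parameter MoE training and fast inference practical.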