all AI news
Microsoft & NVIDIA Leverage DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest Monolithic Language Model
Synced syncedreview.com
A research team from Microsoft and NVIDIA leverages the NVIDIA Megatron-LM and Microsoft’s DeepSpeed to create an efficient and scalable 3D parallel system that combines data, pipeline, and tensor-slicing based parallelism, achieving superior zero-, one-, and few-shot learning accuracies and new state-of-the-art results on NLP benchmarks.
The post Microsoft & NVIDIA Leverage DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest Monolithic Language Model first appeared on Synced.
ai artificial intelligence language language model machine learning machine learning & data science microsoft ml natural-language understanding nature language tech neural networks nlg nvidia parallel-computing research technology transformer turing