Nov. 7, 2023, 5:56 p.m. | Andrew Hoblitzell

InfoQ - AI, ML & Data Engineering www.infoq.com

Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is built on the Dynamic SplitFuse technique. The system currently supports several model architectures.
