Microsoft releases DeepSpeed-FastGen for High-Throughput Text Generation
InfoQ - AI, ML & Data Engineering www.infoq.com
Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is built on the Dynamic SplitFuse technique. The system currently supports several model architectures.
By Andrew Hoblitzell
Tags: ai, alpha, andrew, architectures, deep learning, deepspeed, deployment, development, dynamic, gpu, inference, language, language models, large language, large language models, llms, machine learning, microsoft, ml & data engineering, release, releases, text, text generation