Microsoft releases DeepSpeed-FastGen for High-Throughput Text Generation

Nov. 7, 2023, 5:56 p.m. | Andrew Hoblitzell

InfoQ - AI, ML & Data Engineering www.infoq.com

Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is built on the Dynamic SplitFuse technique. The system currently supports several model architectures.
