Splitwise improves GPU usage by splitting LLM inference phases

Jan. 4, 2024, 5:02 p.m. | Brenda Potts

Expanded LLM use creates new demands on cloud GPU capacity. Splitwise presents an efficient solution by separating the two essential phases of LLM inference, achieving higher throughput within a limited power budget.
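
As a rough illustration of what separating the phases can look like, the sketch below assumes the two phases are prompt processing (often called prefill) and token generation (often called decode), each served by its own pool of machines. It is a toy model rather than Splitwise's implementation: the names Request, prefill, decode, and serve are hypothetical, the "cache" is a plain list standing in for the state handed from one phase to the next, and no real model is run.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt: list[str]
    max_new_tokens: int
    kv_cache: list[str] = field(default_factory=list)  # stand-in for per-request state
    output: list[str] = field(default_factory=list)


def prefill(req: Request) -> None:
    """Phase 1 (assumed): process the whole prompt in one pass and build the cache."""
    req.kv_cache = list(req.prompt)


def decode(req: Request) -> None:
    """Phase 2 (assumed): generate tokens one at a time, extending the cache."""
    for _ in range(req.max_new_tokens):
        token = f"tok{len(req.kv_cache)}"  # placeholder for a sampled token
        req.output.append(token)
        req.kv_cache.append(token)


def serve(requests: list[Request], prefill_pool: list[str], decode_pool: list[str]):
    """Assign each request round-robin to one machine per phase, then run both phases."""
    placements = []
    for i, req in enumerate(requests):
        prefill_machine = prefill_pool[i % len(prefill_pool)]
        decode_machine = decode_pool[i % len(decode_pool)]
        prefill(req)  # would run on prefill_machine
        decode(req)   # would run on decode_machine, after the cache is handed over
        placements.append((prefill_machine, decode_machine))
    return placements
```

Because each phase has its own pool in this sketch, the two pools could be sized or provisioned independently, which is the kind of flexibility the summary above points to.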

The post Splitwise improves GPU usage by splitting LLM inference phases appeared first on Microsoft Research.
