Exploring the Latency/Throughput & Cost Space for LLM Inference // Timothée Lacroix // LLM 3 Talk 3
Oct. 25, 2023, 10:13 a.m. | MLOps.community (www.youtube.com)
Getting the right LLM inference stack means choosing the right model for your task and running it on the right hardware with proper inference code. This talk goes through popular inference stacks and setups, detailing what makes inference costly. We'll talk about the current generation of open-source models and how to make the best use of them, but we will also touch on features currently missing from the open-source serving stack as well as what the future …
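The latency/throughput/cost trade-off the abstract refers to can be sketched with a standard back-of-envelope model: at small batch sizes, autoregressive decoding is memory-bandwidth bound, since every generated token must stream the full set of model weights from GPU memory. The formula below is that common approximation; the model size, bandwidth, and GPU price are illustrative assumptions, not figures from the talk.

```python
# Back-of-envelope decode cost model for an LLM served on a single GPU.
# All concrete numbers below are illustrative assumptions, not measurements.

def decode_tokens_per_second(params_billions: float, mem_bandwidth_gbs: float) -> float:
    """Upper bound on batch-1 decode speed for a bandwidth-bound model.
    With fp16 weights (2 bytes/param), each token requires reading all
    weights once, so tokens/s ~= bandwidth / (2 * params)."""
    bytes_per_token = 2 * params_billions * 1e9  # fp16 weight read per token
    return (mem_bandwidth_gbs * 1e9) / bytes_per_token

def cost_per_million_tokens(tok_per_s: float, gpu_dollars_per_hour: float) -> float:
    """Convert throughput and an hourly GPU price into $ per 1M generated tokens."""
    tokens_per_hour = tok_per_s * 3600
    return gpu_dollars_per_hour / tokens_per_hour * 1e6

# Hypothetical example: a 7B-parameter model on a GPU with ~2 TB/s of
# memory bandwidth, rented at $2/hour.
tps = decode_tokens_per_second(7, 2000)
print(f"{tps:.0f} tok/s, ${cost_per_million_tokens(tps, 2.0):.2f} per 1M tokens")
```

Batching changes the picture: serving many requests at once amortizes each weight read across the whole batch, trading per-request latency for much lower cost per token, which is the core tension the talk's title names.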