April 30, 2024, 5:59 p.m. | MLOps.community

Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/

Huge thank you to @nebiusofficial for sponsoring this episode. Nebius AI - https://nebius.ai/

MLOps podcast #228 with Simon Karasik, Machine Learning Engineer at Nebius AI: Handling Multi-Terabyte LLM Checkpoints.

// Abstract
The talk gives a gentle introduction to LLM checkpointing: why it is hard and how big the checkpoints are. It covers various tips and tricks for saving and loading multi-terabyte checkpoints, as …
