Handling Multi-Terabyte LLM Checkpoints // Simon Karasik // MLOps Podcast #228
April 30, 2024, 5:59 p.m. | MLOps.community
MLOps.community www.youtube.com
Huge thank you to @nebiusofficial for sponsoring this episode. Nebius AI - https://nebius.ai/
MLOps Podcast #228 with Simon Karasik, Machine Learning Engineer at Nebius AI: Handling Multi-Terabyte LLM Checkpoints.
// Abstract
The talk provides a gentle introduction to LLM checkpointing: why it is hard and how big the checkpoints are. It covers various tips and tricks for saving and loading multi-terabyte checkpoints, as …