Dec. 19, 2023, 11:15 p.m. | /u/semicausal

Machine Learning www.reddit.com

Hey r/MachineLearning!

My coworkers worked at Apple on the ML compute platform team, where they constantly found themselves supporting ML engineers with large, distributed ML training jobs. ML engineers either had to use less data or rewrite their training jobs to weave in more complicated *data chunking*. They also struggled to keep GPU utilization above 80% because so much time was spent just waiting for data to load: [https://discuss.pytorch.org/t/how-to-load-all-data-into-gpu-for-training/27609](https://discuss.pytorch.org/t/how-to-load-all-data-into-gpu-for-training/27609)
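To make the "data chunking" pain concrete: instead of loading an entire dataset into memory (or onto the GPU) up front, the training loop consumes fixed-size chunks as they stream in. Here's a minimal pure-Python sketch of that idea (this is illustrative only, not the team's actual library or PyTorch's API; `stream_chunks` and `chunk_size` are hypothetical names):

```python
def stream_chunks(samples, chunk_size=4):
    """Yield lists of at most chunk_size samples, so a training loop
    can start consuming data before the full dataset is in memory."""
    chunk = []
    for sample in samples:
        chunk.append(sample)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the final partial chunk
        yield chunk

# A training loop would iterate chunk by chunk:
chunks = list(stream_chunks(range(10), chunk_size=4))
# chunks == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The hard part in practice is everything this sketch leaves out: sharding chunks across distributed workers, shuffling, and overlapping chunk fetches with GPU compute so utilization doesn't drop while the next chunk loads.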

Inspired by the pains of that experience, they …

