Dec. 19, 2023, 11:15 p.m. | /u/semicausal


Hey r/MachineLearning!

My coworkers worked at Apple on the ML compute platform team and constantly found themselves supporting ML engineers with their large, distributed ML training jobs. ML engineers had to either use less data or rewrite their training jobs to weave in more complicated *data chunking*. They also struggled to keep GPU utilization above 80% because so much time was spent just waiting for data to load: [https://discuss.pytorch.org/t/how-to-load-all-data-into-gpu-for-training/27609](https://discuss.pytorch.org/t/how-to-load-all-data-into-gpu-for-training/27609)
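For context, here's a minimal sketch of the usual plain-PyTorch mitigation for that bottleneck: overlapping data loading with GPU compute via worker processes, pinned memory, and non-blocking host-to-device copies. This is *not* the library described in this post, just the baseline approach it's competing with; the dataset and model here are hypothetical placeholders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical in-memory dataset standing in for a large on-disk one.
xs = torch.randn(10_000, 128)
ys = torch.randint(0, 10, (10_000,))
dataset = TensorDataset(xs, ys)

loader = DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=4,      # load/decode batches in parallel with GPU compute
    pin_memory=True,    # page-locked host memory -> faster H2D copies
    prefetch_factor=2,  # batches each worker keeps queued ahead
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(128, 10).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for x, y in loader:
    # non_blocking=True overlaps the copy with compute when memory is pinned
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Even with this tuning, single-node `DataLoader` workers tend to hit a ceiling once the dataset no longer fits on local disk, which is where the chunking rewrites mentioned above come in.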

Inspired by the pains of that experience, they …

