Feb. 7, 2024, 9:08 p.m. | /u/UpvoteBeast

Machine Learning www.reddit.com

I've been wrestling with the common issue of feeding my PyTorch ML pipelines directly from cloud storage options like S3 instead of EFS. While it's a convenient setup, the performance hit can be quite discouraging, especially when dealing with large datasets or needing speedy iterations for model training and evaluation.

I stumbled upon [this guide](https://cuno.io/blog/optimizing-pytorch-machine-learning-cost-and-performance-using-cunofs/) that talks about optimizing PyTorch performance while reducing costs.

Would love to hear your thoughts on this or if anyone has tried this in their …
