Jan. 14, 2022, 4:48 p.m. | /u/iamquah

Machine Learning www.reddit.com

Hey all!

I'm curious to know what everyone uses for their data processing. In industry I've used Pandas and sklearn for smaller datasets, and Dask for larger ones.
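For context, the Dask pattern I mean looks roughly like this. A minimal sketch, assuming the data sits in a set of CSV shards; the glob pattern and column names are made up:

```python
# Minimal out-of-core processing sketch with Dask.
# "data/part-*.csv", "value", and "sensor_id" are hypothetical.
import dask.dataframe as dd

# Lazily reads many CSV partitions; nothing is loaded into memory yet.
df = dd.read_csv("data/part-*.csv")

# Transformations build a task graph instead of executing immediately.
df["value_norm"] = (df["value"] - df["value"].mean()) / df["value"].std()

# .compute() triggers parallel, chunked execution over the partitions.
result = df.groupby("sensor_id")["value_norm"].mean().compute()
print(result.head())
```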

  • I know that TensorFlow and PyTorch have their own dataloader frameworks, but do they scale to large datasets, say 100GB+? (See the PyTorch sketch after this list.)

  • Do people do data transformations in JAX? (See the JAX sketch after this list.)

  • If you had infinite time, how would you (re)do your company's tech stack?
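On the first bullet: PyTorch's DataLoader can handle larger-than-memory data if you pair it with an IterableDataset that streams shards from disk, rather than a map-style dataset that assumes everything is indexable in RAM. A minimal sketch, assuming the data is stored as many .npy shard files (the paths and class name are hypothetical):

```python
# Streaming a larger-than-memory dataset into PyTorch via IterableDataset.
import glob

import numpy as np
import torch
from torch.utils.data import DataLoader, IterableDataset


class ShardedTimeseries(IterableDataset):
    """Yields samples shard by shard, so only one shard is in RAM at a time."""

    def __init__(self, pattern):
        self.files = sorted(glob.glob(pattern))

    def __iter__(self):
        for path in self.files:
            shard = np.load(path)  # loads one shard, not the whole dataset
            for row in shard:
                yield torch.from_numpy(row)


loader = DataLoader(ShardedTimeseries("shards/*.npy"), batch_size=256)
for batch in loader:
    pass  # training step goes here
```

One caveat: with num_workers > 0 you would also need to split self.files across workers (e.g. using torch.utils.data.get_worker_info()), otherwise every worker yields every shard.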
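On the JAX bullet: in my understanding people mostly load and batch data with something else (NumPy, tf.data, PyTorch loaders) and use JAX for the on-device, per-batch transforms, since jit and vmap make those cheap. A minimal sketch of that style; the standardise-and-clip transform is an arbitrary example:

```python
# Batched preprocessing in JAX: jit-compile a per-sample transform
# and vmap it over the batch dimension.
import jax
import jax.numpy as jnp


def preprocess(x):
    # Per-sample transform: standardise, then clip outliers (arbitrary choice).
    x = (x - x.mean()) / (x.std() + 1e-8)
    return jnp.clip(x, -5.0, 5.0)


# vmap maps preprocess over the leading (batch) axis; jit fuses it on device.
batched_preprocess = jax.jit(jax.vmap(preprocess))

batch = jnp.ones((256, 1024))  # dummy batch: 256 series of length 1024
out = batched_preprocess(batch)
```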

My data is 85% time series, 10% images, and 5% …
