May 3, 2022, 3:35 p.m. | /u/AlopexLagopus3

Data Science www.reddit.com

I'm working on picking up a machine learning pipeline that someone else has written. Here's a summary of what I'm dealing with:

* The pipeline is ~50 Python scripts split across two computers, and running it requires bouncing back and forth between them (part GPU, part CPU; this can eventually be fixed).
* There is no automation: each script was previously invoked by hand, one command at a time.
* There is no organization. The script names are things like "step_1_b_run_before" and "step_1_preprocess_a".
* …
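
As a first step away from invoking each script by hand, the scripts could be listed in one place and run in order by a small driver. The following is only a minimal sketch under stated assumptions: the two file names are taken from the examples above with a `.py` extension added, their relative order is a guess, and `STEPS`, `run_step`, and `run_pipeline` are hypothetical names that do not come from the original pipeline. It also ignores the two-machine split.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List, Sequence

# Hypothetical ordering: the names come from the post, the order is assumed.
STEPS = [
    "step_1_b_run_before.py",
    "step_1_preprocess_a.py",
]


def run_step(script: Path) -> int:
    """Run one script with the current interpreter and return its exit code."""
    result = subprocess.run([sys.executable, str(script)])
    return result.returncode


def run_pipeline(
    steps: Sequence[str],
    script_dir: Path = Path("."),
    runner: Callable[[Path], int] = run_step,
) -> List[str]:
    """Run each step in order, stopping at the first non-zero exit code."""
    completed: List[str] = []
    for name in steps:
        code = runner(script_dir / name)
        if code != 0:
            raise RuntimeError(f"{name} exited with code {code}; stopping")
        completed.append(name)
    return completed
```

Calling `run_pipeline(STEPS)` from the directory that holds the scripts would run them in the listed order. The `runner` parameter is there so the ordering logic can be exercised without launching the real scripts.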

