Dec. 1, 2023, 6:59 a.m. | Mikhail Sarafanov

Towards Data Science - Medium towardsdatascience.com

And how to use it effectively with an XGBoost model

Recently, my colleagues and I have been working on a large, high-load service that uses an XGBoost machine learning model, with Dask as the tool for distributed data processing and forecast generation. Here I would like to share what we learned about getting the most out of Dask for data preparation and ML model fitting.

What is Dask?

Dask is a library …
