Dec. 1, 2023, 6:59 a.m. | Mikhail Sarafanov

Towards Data Science - Medium towardsdatascience.com

And how to use it effectively with an XGBoost model

Preview image (by author)

Recently, my colleagues and I have been working on a large, high-load service that uses an XGBoost machine learning model, with Dask as the tool for distributed data processing and forecast generation. Here I would like to share what we learned about getting the most out of Dask for data preparation and ML model fitting.

What is Dask?

Dask is a library …
