Web: https://www.reddit.com/r/datascience/comments/shf2yc/any_good_book_that_talks_specifically_about/

Jan. 31, 2022, 11:02 p.m. | /u/carusGOAT

Data Science reddit.com

I am looking for a book that deals specifically with the problems encountered when working with massive datasets, and with the strategies, algorithms, and tools developed to (1) feasibly process the dataset and (2) manage or organize your time effectively as a data scientist.

For example, I would like a discussion of data processing engines (MapReduce, Hadoop, and Spark), of the purpose of building data pipelines (Airflow, Luigi), and so on.

I looked on the DS Book Megathread, but from the titles I did not seem …


