Web: https://www.reddit.com/r/datascience/comments/shf2yc/any_good_book_that_talks_specifically_about/

Jan. 31, 2022, 11:02 p.m. | /u/carusGOAT

Data Science reddit.com

I am looking for a book that talks specifically about the problems encountered when dealing with massive datasets and the strategies, algorithms, and tools developed to (1) feasibly process the dataset and (2) manage or organize your time effectively as a data scientist.

I am looking for things like a discussion of data processing engines (MapReduce, Hadoop, and Spark), the purpose of creating data pipelines (Airflow, Luigi), etc.

I looked on the DS Book Megathread, but from the titles I did not seem …

