May 26, 2022, 10 a.m. | Top End Devs

Adventures in Machine Learning redcircle.com


Apache Spark is a lightning-fast unified analytics engine for large-scale data processing and machine learning. In this episode, Ben and Michael unpack Spark by ping-ponging questions and answers, supplemented by various examples applicable to machine learning workflows.


In this Episode…



  1. How does Spark work?

  2. What makes Apache Spark effective?

  3. .repartition() in Spark

  4. Parallel processing systems

  5. What is an aggregation in Spark SQL?

  6. Analytics with Spark

  7. What is MPP?

  8. Testing for production

  9. Spark algorithms


