May 26, 2022, 10 a.m. | Ben Wilson, Michael Berk

Adventures in Machine Learning redcircle.com


Apache Spark is a lightning-fast unified analytics engine for large-scale data processing and machine learning. In this episode, Ben and Michael unpack Spark by ping-ponging questions and answers, supplemented by various examples applicable to machine learning workflows.


In this Episode…



  1. How does Spark work?

  2. What makes Apache Spark effective?

  3. .repartition() in Spark

  4. Parallel processing systems

  5. What is an aggregation in Spark SQL?

  6. Analytics with Spark

  7. What is MPP?

  8. Testing for production

  9. Spark algorithms

