Oct. 4, 2022, 1:49 p.m. | Davide Romano

Towards Data Science - Medium towardsdatascience.com

Companion GitHub repository: Hands-on Great Expectations with Spark

Build a Data Quality workflow with Great Expectations and Spark

Photo by Markus Winkler on Unsplash

Introduction

At Mediaset, the Data Lake is a fundamental tool, used daily by anyone who wants to extract company insights or activate data.

By definition, a Data Lake is “a centralized repository to store all your structured and unstructured data at any scale. You can store data natively from the source without having …
