Jan. 10, 2022, 4:14 p.m. | Prerna Singh

Towards Data Science - Medium towardsdatascience.com

Introduction

We can define data leakage as:

“When a data set contains relevant data, but similar data is not obtainable when the model is used for prediction, data leakage (or leaking) occurs. This results in great success on the training dataset (and possibly even high validation accuracy), but poor performance in production.”
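The definition above can be made concrete with a small sketch. It is not taken from the article: the synthetic data, the feature names, and the helpers `make_rows`, `fit_stump`, and `accuracy` are all hypothetical and chosen only for illustration. One feature copies the label in the training and validation data but is unknown (filled with 0) at prediction time, so a simple one-feature threshold model looks excellent offline and falls to roughly chance in production.

```python
import random


def make_rows(n, rng, leak_available):
    """Build synthetic rows of (weak_signal, leaky_feature, label)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Legitimate but weak signal: the label buried in heavy noise.
        weak_signal = label + rng.gauss(0, 2.0)
        # Leaky feature: recorded after the outcome is known, so it equals
        # the label offline but is unavailable (0.0) at prediction time.
        leaky = float(label) if leak_available else 0.0
        rows.append((weak_signal, leaky, label))
    return rows


def fit_stump(rows):
    """Pick the (feature index, threshold) with the best training accuracy."""
    best_acc, best_feature, best_threshold = 0.0, 0, 0.0
    for f in (0, 1):
        for t in sorted({r[f] for r in rows}):
            acc = sum((r[f] >= t) == bool(r[2]) for r in rows) / len(rows)
            if acc > best_acc:
                best_acc, best_feature, best_threshold = acc, f, t
    return best_feature, best_threshold


def accuracy(model, rows):
    """Fraction of rows where the stump's prediction matches the label."""
    f, t = model
    return sum((r[f] >= t) == bool(r[2]) for r in rows) / len(rows)


rng = random.Random(0)
train = make_rows(200, rng, leak_available=True)
valid = make_rows(200, rng, leak_available=True)   # the leak is present here too
prod = make_rows(1000, rng, leak_available=False)  # the leak is gone in production

model = fit_stump(train)
train_acc = accuracy(model, train)
valid_acc = accuracy(model, valid)
prod_acc = accuracy(model, prod)

print("chosen feature index:", model[0])
print("train accuracy:", train_acc)
print("validation accuracy:", valid_acc)
print("production accuracy:", prod_acc)
```

Because the validation split shares the leak, it cannot reveal the problem; only data collected the way production data will be collected exposes the gap.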

Data leakage, or simply leaking, is a term used in machine learning to describe the situation in which the data used to train a machine-learning …
