Feb. 14, 2024, 6:12 a.m. | Andrew Skabar, PhD

Towards Data Science - Medium towardsdatascience.com

Evaluating Synthetic Data — The Million Dollar Question

Are my real and synthetic datasets random samples from the same parent distribution?


When we generate synthetic data, we typically fit a model to our real (or ‘observed’) data and then use that model to generate the synthetic data. The observed data is usually compiled from real-world experiences, such as measurements of the physical characteristics of irises or details about individuals who have defaulted on …
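As a rough illustration of the fit-then-sample workflow described above, here is a minimal sketch in Python. It assumes a single numeric feature modelled with a normal distribution, which is a deliberate simplification and not necessarily the kind of model the article goes on to use. The helper name `fit_and_sample` and the "observed" values (drawn with an arbitrary mean and spread) are hypothetical stand-ins for real measurements.

```python
import random
from statistics import NormalDist


def fit_and_sample(real, n_synthetic, seed=0):
    """Fit a normal distribution to `real` and draw synthetic values from it.

    Assumption: one numeric feature, modelled as Gaussian. Real generators
    are usually multivariate and far more flexible.
    """
    # Estimate the mean and standard deviation from the observed data.
    model = NormalDist.from_samples(real)
    # Draw new values from the fitted model; these are the synthetic data.
    synthetic = model.samples(n_synthetic, seed=seed)
    return model, synthetic


# Hypothetical "observed" data standing in for real measurements.
rng = random.Random(42)
real = [rng.gauss(5.8, 0.8) for _ in range(500)]

model, synthetic = fit_and_sample(real, n_synthetic=500)

# Refit on the synthetic sample to compare summary statistics.
synthetic_model = NormalDist.from_samples(synthetic)
print(f"real:      mean={model.mean:.2f}, stdev={model.stdev:.2f}")
print(f"synthetic: mean={synthetic_model.mean:.2f}, stdev={synthetic_model.stdev:.2f}")
```

Matching means and standard deviations is only a first sanity check. It does not by itself answer whether the two datasets are random samples from the same parent distribution.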

data science hands-on-tutorials statistics synthetic data synthetic-data-generation

