Oct. 14, 2022, 5:56 p.m. | Robin Linacre

Towards Data Science - Medium towardsdatascience.com

How unsupervised learning is used to estimate model parameters in Splink

Photo by Suzanne D. Williams on Unsplash

Splink is a free probabilistic record linkage library that predicts the likelihood that two records refer to the same entity. For example, what is the probability that the following two records match?

[Figure: Example pairwise record comparison]

The underlying statistical model is called the Fellegi-Sunter model. It works by computing partial match weights, which are a measure of the importance of the …
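To make the idea of partial match weights concrete, here is a minimal sketch of the standard Fellegi-Sunter arithmetic. It assumes the usual textbook formulation, in which each comparison has an m probability (the chance of observing that outcome among true matches) and a u probability (the chance among non-matches), and the partial match weight is log2(m / u). The function names, the column names and every m, u and prior value below are hypothetical and chosen only for illustration; this is not Splink's API, and in practice these parameters are estimated from the data.

```python
import math


def partial_match_weight(m: float, u: float) -> float:
    """Partial match weight for one comparison outcome: log2 of the Bayes factor m / u."""
    return math.log2(m / u)


def match_probability(prior: float, weights: list[float]) -> float:
    """Combine a prior match probability with partial match weights.

    The prior is converted to a log2-odds weight, the partial match
    weights are added to it, and the total is converted back to a probability.
    """
    prior_weight = math.log2(prior / (1 - prior))
    total_weight = prior_weight + sum(weights)
    odds = 2 ** total_weight
    return odds / (1 + odds)


# Hypothetical (m, u) values for three comparison outcomes on one record pair.
illustrative_mu = {
    "first_name_exact_match": (0.9, 0.01),
    "dob_exact_match": (0.95, 0.001),
    "city_mismatch": (0.1, 0.9),
}

weights = [partial_match_weight(m, u) for m, u in illustrative_mu.values()]

# Hypothetical prior: 1 in 1,000 candidate pairs is a true match.
probability = match_probability(prior=0.001, weights=weights)
print(f"partial match weights: {[round(w, 2) for w in weights]}")
print(f"match probability: {probability:.4f}")
```

Outcomes that are more likely among matches than non-matches (m > u) receive a positive weight and push the probability up, while outcomes that are more likely among non-matches receive a negative weight and pull it down.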

data science entity-resolution fuzzy-matching intuition
