May 5, 2023, 5:27 p.m. | /u/Busy-Pie-4468

Data Science www.reddit.com

I have a dataset of transactions, each associated with a company. The problem is that the dataset contains many different labels that refer to the same company. For example:

Acme International Inc
Acme International Inc.
Acme Intl Inc
Acme Intl Inc., (Los Angeles)

I am looking for a way to preprocess my data such that all labels for the same company can be normalized to the same label (something like a "probabilistic foreign key"). I think this falls under …
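One possible starting point, sketched here under stated assumptions rather than offered as a definitive answer: clean each label into a comparison key (lowercase, drop parenthetical notes, strip punctuation, expand common abbreviations), then greedily group keys whose string similarity exceeds a cutoff. The `ABBREVIATIONS` map, the `normalize` and `cluster` helpers, and the `0.9` threshold are all hypothetical choices for illustration and would need tuning on real data. Only the Python standard library is used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map (an assumption); extend it for your own data.
ABBREVIATIONS = {
    "intl": "international",
    "inc": "incorporated",
    "corp": "corporation",
}


def normalize(label: str) -> str:
    """Reduce a raw company label to a canonical comparison key."""
    text = label.lower()
    text = re.sub(r"\(.*?\)", " ", text)      # drop notes such as "(Los Angeles)"
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation with spaces
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def cluster(labels, threshold=0.9):
    """Greedily put each label in the first cluster whose key is similar enough.

    Returns a dict mapping a canonical key to the raw labels assigned to it.
    The threshold is an assumed cutoff on SequenceMatcher's 0..1 ratio.
    """
    clusters = {}
    for label in labels:
        key = normalize(label)
        for existing in clusters:
            if SequenceMatcher(None, key, existing).ratio() >= threshold:
                clusters[existing].append(label)
                break
        else:
            # No sufficiently similar cluster was found, so start a new one.
            clusters[key] = [label]
    return clusters


labels = [
    "Acme International Inc",
    "Acme International Inc.",
    "Acme Intl Inc",
    "Acme Intl Inc., (Los Angeles)",
    "Globex Corp",
]
groups = cluster(labels)
for key, members in groups.items():
    print(key, "<-", members)
```

The canonical key can then serve as the shared label for every transaction in its group. The greedy pass compares each label against every existing cluster key, so it may become slow on large datasets; grouping candidates first (for example by the first token of the key) is a common way to cut down the number of comparisons.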

