Web: https://www.reddit.com/r/datascience/comments/xgn7tl/ml_for_string_matching_when_theres_no_any/

Sept. 17, 2022, 1:57 p.m. | /u/devwander1

Data Science reddit.com


I need your help with a problem that I encountered recently.

I have a set of invoices with some products on them. The problem is that those products are listed under different names in the database. The database names are the true labels.

How can I make the system automatically flag "Apl Juice 0.5L" when it encounters "Apple 500ml", for example?

I tried Levenshtein distance as a similarity metric, but for other complicated …
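For context, here is a minimal sketch of what a Levenshtein-based matcher might look like. It is not the poster's actual code: the helper names (`levenshtein`, `similarity`, `best_match`), the lowercasing step, and the second database entry are assumptions made for illustration. It also hints at why plain edit distance struggles here: "Apple 500ml" and "Apl Juice 0.5L" share few characters in order, so their score stays relatively low even though they refer to the same product.

```python
from __future__ import annotations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(invoice_name: str, database_names: list[str]) -> tuple[str, float]:
    """Return the database name with the highest similarity, plus its score."""
    scored = [(name, similarity(invoice_name, name)) for name in database_names]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical database entries; only the first comes from the post.
database_names = ["Apl Juice 0.5L", "Orange Juice 1L"]
print(best_match("Apple 500ml", database_names))
```

Because the score depends only on character edits, abbreviations ("Apl") and unit changes ("500ml" versus "0.5L") count as large differences, which is the limitation the question describes.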

