July 27, 2023, 1:35 a.m. | /u/son_of_tv_c

Data Science www.reddit.com

So here's a high-level overview of the problem: I have data from a retail store chain, with product SKUs as columns, customers as rows, and the number of times each customer has purchased each product at the intersection. The dataset is both sparse and wide. The task is to cluster customers based on their product purchasing history.
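To make the layout concrete, here is a minimal sketch of how such a customer-by-SKU count table could be held sparsely, storing only the nonzero cells. The purchase log, the customer and SKU names, and the helpers `build_sparse_counts` and `density` are all made up for illustration; they are not from the actual dataset.

```python
from collections import defaultdict

# Hypothetical purchase log: one (customer_id, sku) pair per purchase event.
purchases = [
    ("c1", "sku_a"), ("c1", "sku_a"), ("c1", "sku_b"),
    ("c2", "sku_c"),
    ("c3", "sku_a"), ("c3", "sku_c"), ("c3", "sku_c"),
]


def build_sparse_counts(purchases):
    """Return {customer: {sku: count}}, keeping only nonzero cells."""
    counts = defaultdict(lambda: defaultdict(int))
    for customer, sku in purchases:
        counts[customer][sku] += 1
    return {customer: dict(row) for customer, row in counts.items()}


def density(counts):
    """Fraction of customer-by-SKU cells that are nonzero."""
    skus = {sku for row in counts.values() for sku in row}
    nonzero = sum(len(row) for row in counts.values())
    return nonzero / (len(counts) * len(skus))


counts = build_sparse_counts(purchases)
print(counts)
print(f"density: {density(counts):.2f}")
```

With thousands of SKUs and most customers buying only a handful of them, the density gets very small, which is what "sparse and wide" means here.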

Based on some recommendations from here, I first applied max-abs scaling, then UMAP to reduce dimensionality, then a GMM to cluster, much better …
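For reference, the first step of that pipeline can be sketched in plain Python. Max-abs scaling divides each column by its largest absolute value, so zeros stay zero and the sparsity is preserved. The toy `counts` table and the helper `max_abs_scale` are assumptions for illustration only; in practice this would more likely be done with a library scaler, and the UMAP and GMM steps (which need third-party packages such as umap-learn and scikit-learn) are not shown here.

```python
# Hypothetical sparse count table: {customer: {sku: purchase_count}}.
counts = {
    "c1": {"sku_a": 2, "sku_b": 1},
    "c2": {"sku_c": 1},
    "c3": {"sku_a": 1, "sku_c": 2},
}


def max_abs_scale(counts):
    """Divide every SKU column by its largest absolute value.

    Only stored (nonzero) cells are touched, so the table stays sparse.
    """
    col_max = {}
    for row in counts.values():
        for sku, value in row.items():
            col_max[sku] = max(col_max.get(sku, 0.0), abs(value))
    return {
        customer: {
            # Guard against a column whose stored values are all zero.
            sku: (value / col_max[sku] if col_max[sku] else 0.0)
            for sku, value in row.items()
        }
        for customer, row in counts.items()
    }


scaled = max_abs_scale(counts)
print(scaled)
```

After scaling, every column lies in [-1, 1] (here [0, 1], since counts are non-negative), so a single very popular SKU no longer dominates distances just because its raw counts are large.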

