April 1, 2024, 1:57 p.m. | /u/bernful

Data Science www.reddit.com

I am attempting to cluster stores based on their sales. I can either do:

1. Univariate K-Means clustering by way of the Ckmeans.1d.dp package in R. This works perfectly fine; the only two cons are figuring out the upper limit on K and possibly explainability to the client.
2. Fixed cluster boundaries. In this case, I average the sales of all stores and create boundaries such as 50% below average, 25% below average, 25% above average, and 50% above average. This is …
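
To make the two options concrete, here is a minimal Python sketch of both, using only the standard library. It is an illustration under stated assumptions, not the Ckmeans.1d.dp implementation: `kmeans_1d` is a plain O(k·n²) dynamic program for exact 1-D k-means, and `fixed_boundary_clusters`, `CUTS`, and the sample figures are hypothetical names and values chosen for the example. The cut points assume the boundaries sit at 0.50, 0.75, 1.25, and 1.50 times the average, which gives five bands.

```python
from bisect import bisect_right
from statistics import mean

# Assumed cut points, as multiples of average store sales:
# 50% below, 25% below, 25% above, 50% above.
CUTS = [0.50, 0.75, 1.25, 1.50]


def fixed_boundary_clusters(sales, cuts=CUTS):
    """Option 2: band index 0..len(cuts) from sales relative to the mean."""
    avg = mean(sales)
    return [bisect_right(cuts, s / avg) for s in sales]


def kmeans_1d(sales, k):
    """Option 1 (sketch): exact 1-D k-means via dynamic programming.

    Requires 1 <= k <= len(sales). Labels are ordered by sales level.
    """
    order = sorted(range(len(sales)), key=lambda i: sales[i])
    x = [sales[i] for i in order]
    n = len(x)

    # Prefix sums of values and squared values for O(1) segment cost.
    p1, p2 = [0.0], [0.0]
    for v in x:
        p1.append(p1[-1] + v)
        p2.append(p2[-1] + v * v)

    def sse(i, j):
        # Within-cluster sum of squares of the sorted slice x[i:j], j > i.
        s = p1[j] - p1[i]
        return (p2[j] - p2[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[c][j]: best total SSE for the first j sorted points in c clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + sse(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    split[c][j] = i

    # Walk the split points backwards to recover the labels.
    labels = [0] * n
    j = n
    for c in range(k, 0, -1):
        i = split[c][j]
        for t in range(i, j):
            labels[order[t]] = c - 1
        j = i
    return labels


if __name__ == "__main__":
    stores = [40, 70, 100, 130, 160]  # made-up sales figures
    print(fixed_boundary_clusters(stores))
    print(kmeans_1d([10, 12, 11, 100, 102, 300], 3))
```

The fixed bands are easy to explain to a client but ignore where the data actually groups, while the k-means split follows the data but still needs a choice of K.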
