March 27, 2022, 8:47 p.m. | Dhruvil Karani

Towards Data Science - Medium towardsdatascience.com

Going beyond the theory and getting the best out of your K-Means clustering algorithm

Iterations in KMeans. Original GIF.

Table of contents

  1. What is KMeans?
    a. Python implementation
  2. Things to know before using KMeans
    a. K-Means cannot handle non-globular structure
    b. K-Means is sensitive to outliers
    c. Should you scale your data before using KMeans?
  3. How do you measure the performance of your model?

Note: The datasets used in the article are generated using sklearn’s make_blobs and make_moons functions …
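As a rough illustration of the kind of setup the note describes, here is a minimal sketch that generates one globular and one non-globular dataset with scikit-learn and fits KMeans to each. The sample counts, noise level, cluster counts, seeds, and the `fit_kmeans` helper are all assumptions made for this example; the article excerpt does not state the values it actually used.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_moons

# Assumed parameters: the excerpt does not give sample counts, noise, or seeds.
X_blobs, y_blobs = make_blobs(
    n_samples=300, centers=3, cluster_std=0.8, random_state=0
)
X_moons, y_moons = make_moons(n_samples=300, noise=0.05, random_state=0)


def fit_kmeans(X, n_clusters):
    """Hypothetical helper: fit KMeans and return the model and its labels."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(X)
    return model, labels


# Globular data: three roughly spherical groups.
blob_model, blob_labels = fit_kmeans(X_blobs, n_clusters=3)

# Non-globular data: two interleaving half-circles.
moon_model, moon_labels = fit_kmeans(X_moons, n_clusters=2)

print("blob centroids shape:", blob_model.cluster_centers_.shape)
print("blob inertia:", blob_model.inertia_)
print("moon inertia:", moon_model.inertia_)
```

Plotting `X_moons` colored by `moon_labels` is one way to see the issue the outline lists as item 2a: KMeans partitions the plane by distance to the nearest centroid, so it will tend to cut across the two crescents instead of following them.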

