April 4, 2024, 6:42 p.m. | /u/o-rka

Data Science www.reddit.com

For instance, say I have 1000 features that I cluster with algorithm A. I then obtain another 500 features and would like to reuse the existing cluster information rather than recluster everything from scratch.

Is there a clustering algorithm (ideally in sklearn and not k-means) that can handle this type of usage?

In one use case, the distance metric I plan to use is Jaccard, since my data will be binary.
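
To make the setup concrete, here is a minimal sketch of the behaviour I'm after, in plain Python rather than sklearn. It assumes the earlier clustering is available as a mapping from cluster label to member vectors, and it assigns each new binary vector to the existing cluster with the lowest mean Jaccard distance. The names `assign_to_existing` and `max_dist`, and the 0.5 cutoff for opening a new cluster, are placeholders I made up for illustration, not part of any library.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def assign_to_existing(clusters, new_items, max_dist=0.5):
    """Assign new items to existing clusters without reclustering.

    clusters: dict mapping label -> list of binary vectors (earlier result).
    max_dist: hypothetical cutoff; items farther than this from every
              cluster start a new cluster instead.
    The dict is updated in place; the labels of the new items are returned.
    """
    labels = []
    next_label = max(clusters) + 1 if clusters else 0
    for item in new_items:
        best_label, best_dist = None, float("inf")
        for label, members in clusters.items():
            # Mean distance to all current members (average linkage).
            d = sum(jaccard_distance(item, m) for m in members) / len(members)
            if d < best_dist:
                best_label, best_dist = label, d
        if best_label is None or best_dist > max_dist:
            best_label = next_label
            next_label += 1
            clusters[best_label] = []
        clusters[best_label].append(item)
        labels.append(best_label)
    return labels


# Toy data: two existing clusters, three new binary vectors.
existing = {
    0: [[1, 1, 0, 0], [1, 1, 1, 0]],
    1: [[0, 0, 1, 1], [0, 0, 0, 1]],
}
new = [[1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]]
labels = assign_to_existing(existing, new)
print(labels)  # [0, 1, 2]
```

The existing assignments are never revisited; only the new vectors get labels, and one that is far from everything opens a new cluster. What I'm looking for is an algorithm that supports this kind of update natively.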

