From Large to Small Datasets: Size Generalization for Clustering Algorithm Selection
Feb. 23, 2024, 5:42 a.m. | Vaggos Chatziafratis, Ishani Karmarkar, Ellen Vitercik
cs.LG updates on arXiv.org
Abstract: In clustering algorithm selection, we are given a massive dataset and must efficiently select which clustering algorithm to use. We study this problem in a semi-supervised setting, with an unknown ground-truth clustering that we can only access through expensive oracle queries. Ideally, the clustering algorithm's output will be structurally close to the ground truth. We approach this problem by introducing a notion of size generalization for clustering algorithm accuracy. We identify conditions under which we …
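The setting in the abstract can be illustrated with a toy sketch. The visible part of the abstract does not spell out the authors' procedure or their conditions, so everything here is an assumption made for illustration: selection is done by running each candidate on a small random subsample, the oracle returns the ground-truth label of a single point, and accuracy is measured with the Rand index (pairwise agreement). The names `select_on_subsample`, `largest_gap_split`, and `median_split` are hypothetical.

```python
import random
from itertools import combinations

def pairwise_agreement(labels_a, labels_b):
    """Rand index: fraction of point pairs that both labelings treat alike."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

def largest_gap_split(xs):
    """Toy candidate: cut sorted 1-D points at the widest gap (two clusters)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    gaps = [xs[order[k + 1]] - xs[order[k]] for k in range(len(order) - 1)]
    cut = gaps.index(max(gaps))
    labels = [0] * len(xs)
    for i in order[cut + 1:]:
        labels[i] = 1
    return labels

def median_split(xs):
    """Toy candidate: cut sorted 1-D points into two equal halves."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    labels = [0] * len(xs)
    for i in order[len(xs) // 2:]:
        labels[i] = 1
    return labels

def select_on_subsample(points, candidates, oracle, sample_size, seed=0):
    """Score each candidate on a random subsample (assumed procedure).

    Only `sample_size` oracle queries are spent, instead of one per point.
    """
    rng = random.Random(seed)
    idx = rng.sample(range(len(points)), sample_size)
    sample = [points[i] for i in idx]
    truth = [oracle(i) for i in idx]
    scores = {
        name: pairwise_agreement(algo(sample), truth)
        for name, algo in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic data: 700 points in [0, 1) and 300 points in [10, 11).
points = [i / 700 for i in range(700)] + [10 + i / 300 for i in range(300)]
oracle = lambda i: 0 if i < 700 else 1  # stand-in for an expensive query

candidates = {"largest_gap": largest_gap_split, "median": median_split}
best, scores = select_on_subsample(points, candidates, oracle, sample_size=60)
print(best, scores)
```

Whether the ranking obtained on the subsample carries over to the full dataset is exactly the size-generalization question the abstract raises; this sketch only shows the mechanics of the setup and makes no claim about when that transfer holds.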