all AI news
Finding Outliers in Gaussian Model-Based Clustering
April 8, 2024, 4:45 a.m. | Katharine M. Clark, Paul D. McNicholas
stat.ML updates on arXiv.org arxiv.org
Abstract: Clustering, or unsupervised classification, is a task often plagued by outliers. Yet there is a paucity of work on handling outliers in clustering. Outlier identification algorithms tend to fall into three broad categories: outlier inclusion, outlier trimming, and \textit{post hoc} outlier identification methods, with the former two often requiring pre-specification of the number of outliers. The fact that sample Mahalanobis distance is beta-distributed is used to derive an approximate distribution for the log-likelihoods of subset …
abstract algorithms arxiv classification clustering identification inclusion outlier outliers stat.me stat.ml type unsupervised work
More from arxiv.org / stat.ML updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Associate Data Engineer
@ Nominet | Oxford/ Hybrid, GB
Data Science Senior Associate
@ JPMorgan Chase & Co. | Bengaluru, Karnataka, India