Jan. 1, 2023 | Noirrit Kiran Chandra, Antonio Canale, David B. Dunson

JMLR www.jmlr.org

Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions under which the finite sample posterior tends to either assign every observation to a different cluster or all …

