Web: http://arxiv.org/abs/2206.09107

June 23, 2022, 1:12 a.m. | Jianmin Chen, Robert H. Aseltine, Fei Wang, Kun Chen

stat.ML updates on arXiv.org

Statistical learning with a large number of rare binary features is commonly
encountered in analyzing electronic health records (EHR) data, especially in
the modeling of disease onset with prior medical diagnoses and procedures.
Dealing with the resulting highly sparse and large-scale binary feature matrix
is notoriously challenging: conventional methods may suffer from a lack of
power in testing and inconsistency in model fitting, while machine learning
methods may be unable to produce interpretable results or
clinically meaningful risk …


