Feb. 5, 2024, 3:44 p.m. | Minwook Kim Juseong Kim Ki Beom Kim Giltae Song

cs.LG updates on arXiv.org arxiv.org

Self-training has gained attraction because of its simplicity and versatility, yet it is vulnerable to noisy pseudo-labels caused by erroneous confidence. Several solutions have been proposed to handle the problem, but they require significant modifications in self-training algorithms or model architecture, and most have limited applicability in tabular domains. To address this issue, we explore a novel direction of reliable confidence in self-training contexts and conclude that the confidence, which represents the value of the pseudo-label, should be aware of …

algorithms architecture cluster confidence cs.lg data domains issue labels self-training simplicity solutions tabular tabular data training vulnerable

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior Data Scientist

@ ITE Management | New York City, United States