June 27, 2022, 1:12 a.m. | Hongxin Wei, Lue Tao, Renchunzi Xie, Lei Feng, Bo An

cs.CV updates on arXiv.org

Deep neural networks usually perform poorly when the training dataset suffers from extreme class imbalance. Recent studies have found that directly training with out-of-distribution data (i.e., open-set samples) in a semi-supervised manner harms generalization performance. In this work, we show theoretically, from a Bayesian perspective, that out-of-distribution data can still be leveraged to augment the minority classes. Motivated by this, we propose a novel method called Open-sampling, which utilizes open-set noisy labels to re-balance the class priors of …
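The abstract cuts off before the method's details, but the stated idea, drawing open-set samples and assigning them noisy in-distribution labels so that the effective class prior is re-balanced, can be illustrated with a short sketch. Everything below is an assumption for illustration: the function names and the deficit-based rebalancing rule are hypothetical stand-ins, not the authors' implementation.

import torch
from collections import Counter

def complementary_label_distribution(labels, num_classes):
    # Count how many training examples each class has.
    counts = torch.zeros(num_classes)
    for cls, n in Counter(labels).items():
        counts[cls] = float(n)
    # Assumed rebalancing rule: a class's sampling weight is its deficit
    # relative to the largest class, so rare classes receive proportionally
    # more open-set samples.
    deficit = counts.max() - counts
    if deficit.sum() == 0:  # already balanced: fall back to uniform
        return torch.full((num_classes,), 1.0 / num_classes)
    return deficit / deficit.sum()

def assign_open_set_labels(num_open, labels, num_classes):
    # Draw a (deliberately noisy) in-distribution label for every open-set
    # sample. These labels are wrong by construction, since open-set images
    # belong to none of the known classes, but they shift the label
    # distribution seen by the classifier toward balance.
    probs = complementary_label_distribution(labels, num_classes)
    return torch.multinomial(probs, num_open, replacement=True)

# Toy long-tailed label set: class 0 dominates.
train_labels = [0] * 500 + [1] * 50 + [2] * 5
open_labels = assign_open_set_labels(1000, train_labels, num_classes=3)
print(torch.bincount(open_labels, minlength=3))  # all mass on rare classes 1 and 2

One consequence of this particular (assumed) deficit rule is that the majority class receives no open-set samples at all; a softer weighting could be substituted without changing the overall scheme.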

Tags: arxiv, data, datasets, distribution, lg, sampling
