Feb. 6, 2022, 1:29 p.m. | /u/Zman420

Machine Learning www.reddit.com

Hi all,

Say I have a binary classification problem with an unbalanced dataset, where the positive class is about 1/4 as prevalent as the negative class. I'm using Keras with about 500 features; the training set has 2.8M negative examples and about 700k positive examples. I'm mainly looking at F1 and precision as my metrics.

For training, I use under-sampling to balance the dataset, which leaves about 1.4M examples of balanced data. The validation (and testing) sets I obviously …
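
For anyone following along, here is a minimal sketch of the kind of random under-sampling described above. It is not the poster's actual code: the helper name `undersample_majority`, the toy data, and the choice of NumPy are all assumptions for illustration, and the class sizes are scaled down by 1000x to keep the same 4:1 ratio.

```python
import numpy as np

def undersample_majority(X, y, seed=0):
    """Hypothetical helper: randomly drop examples from the larger class
    until both classes have the same number of rows."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    n_keep = min(len(pos_idx), len(neg_idx))
    # Sample without replacement so no row is duplicated.
    pos_keep = rng.choice(pos_idx, size=n_keep, replace=False)
    neg_keep = rng.choice(neg_idx, size=n_keep, replace=False)
    # Shuffle so the classes are interleaved in the result.
    keep = rng.permutation(np.concatenate([pos_keep, neg_keep]))
    return X[keep], y[keep]

# Toy stand-in for the training set: 2800 negatives, 700 positives, 5 features.
data_rng = np.random.default_rng(42)
y = np.array([0] * 2800 + [1] * 700)
X = data_rng.normal(size=(len(y), 5))

X_bal, y_bal = undersample_majority(X, y)
print(X_bal.shape, int(y_bal.sum()))
```

In this sketch only the training split would be passed through the helper; the validation and test splits would be left at their original class ratio.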
