Feb. 26, 2024, 5:43 a.m. | Arshmeet Kaur, Morteza Sarmadi

cs.LG updates on arXiv.org

arXiv:2402.14980v1 Announce Type: cross
Abstract: Rapid advancements in genome sequencing have led to the collection of vast amounts of genomics data. Researchers may be interested in using machine learning models on such data to predict the pathogenicity or clinical significance of a genetic mutation. However, many genetic datasets contain imbalanced target variables that pose challenges to machine learning models: observations are skewed/imbalanced in regression tasks or class-imbalanced in classification tasks. Genetic datasets are also often high-cardinality and contain skewed predictor …

