Feb. 26, 2024, 5:43 a.m. | Arshmeet Kaur, Morteza Sarmadi

cs.LG updates on arXiv.org arxiv.org

arXiv:2402.14980v1 Announce Type: cross
Abstract: Rapid advancements in genome sequencing have led to the collection of vast amounts of genomics data. Researchers may be interested in using machine learning models on such data to predict the pathogenicity or clinical significance of a genetic mutation. However, many genetic datasets contain imbalanced target variables that pose challenges to machine learning models: observations are skewed/imbalanced in regression tasks or class-imbalanced in classification tasks. Genetic datasets are also often high-cardinal and contain skewed predictor …

abstract analysis arxiv classification collection comparative analysis cs.lg data data preprocessing feature feature selection genome genomics machine machine learning machine learning models performance q-bio.qm regression researchers sequencing stat.ml type vast

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Software Engineering Manager, Generative AI - Characters

@ Meta | Bellevue, WA | Menlo Park, CA | Seattle, WA | New York City | San Francisco, CA

Senior Operations Research Analyst / Predictive Modeler

@ LinQuest | Colorado Springs, Colorado, United States