Comparative Analysis of Data Preprocessing Methods, Feature Selection Techniques and Machine Learning Models for Improved Classification and Regression Performance on Imbalanced Genetic Data
Feb. 26, 2024, 5:43 a.m. | Arshmeet Kaur, Morteza Sarmadi
cs.LG updates on arXiv.org
Abstract: Rapid advances in genome sequencing have led to the collection of vast amounts of genomic data. Researchers may want to apply machine learning models to such data to predict the pathogenicity or clinical significance of a genetic mutation. However, many genetic datasets have imbalanced target variables that pose challenges for machine learning models: the target is skewed in regression tasks or class-imbalanced in classification tasks. Genetic datasets also often have high cardinality and contain skewed predictor …