May 22, 2022, 9:53 a.m. | /u/Rafaelkoll

Machine Learning www.reddit.com

I have a dataset of cancer cell lines (rows). The features (columns) are genes, and each gene's value is its PDUI (polyadenylation site usage index) score, which lies between 0 and 1. The dataset has 53 rows (cell lines) and 12,500 columns (genes).

For each cell line, I also have an IC50 value (taken from a different dataset) which shows resistance/sensitivity to an anti-cancer drug being developed. I would like to cluster the cell lines considering **all …
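To make the setup concrete, here is a minimal sketch of one possible workflow, not necessarily what the post goes on to ask for (the question is cut off above). It assumes the PDUI matrix is a 53 x 12,500 array with one IC50 value per cell line, and it fills both with random numbers as stand-ins for the real data. The function name `cluster_cell_lines`, the choice of 10 principal components, and the choice of 3 clusters are all illustrative assumptions. Because there are far more genes than cell lines, the sketch first projects onto a few principal components, clusters the cell lines with Ward hierarchical clustering, and then checks whether IC50 differs between clusters with a Kruskal-Wallis test.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import kruskal

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real data (assumption: all values are made up).
n_cell_lines, n_genes = 53, 12_500
pdui = rng.uniform(0.0, 1.0, size=(n_cell_lines, n_genes))    # rows = cell lines
ic50 = rng.lognormal(mean=0.0, sigma=1.0, size=n_cell_lines)  # one value per cell line


def cluster_cell_lines(X, n_components=10, n_clusters=3):
    """Project rows onto a few principal components, then cluster with Ward linkage."""
    Xc = X - X.mean(axis=0)  # center each gene (column)
    # Thin SVD: with 53 rows there are at most 53 components to compute.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # PCA scores per cell line
    Z = linkage(scores, method="ward")
    # "maxclust" cuts the tree so that at most n_clusters clusters remain.
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return labels, scores


labels, scores = cluster_cell_lines(pdui)

# Compare IC50 across the clusters (clustering itself never saw IC50).
groups = [ic50[labels == k] for k in np.unique(labels)]
stat, p_value = kruskal(*groups)
print("cluster sizes:", np.bincount(labels)[1:])
print("Kruskal-Wallis p-value for IC50 across clusters:", p_value)
```

With random inputs the clusters and the p-value carry no meaning; the point is only the shape of the pipeline. With 53 samples, any cluster count or component count should be treated as a tuning choice to sanity-check, not a fixed recommendation.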

clustering machinelearning
