May 19, 2022, 11:05 a.m. | /u/triary95

Machine Learning www.reddit.com

I have a very limited biological dataset with dimensions of about 200 x 700. I am building regression models with 170 samples in the training data and 30 in the test data. Mainly, the cross-validated models in caret give me performance (R2, RMSE, MAE) that is about 50% lower than when I test my model on the test set. I understand that this might be due to a bad data split, with the test set having 'simpler' samples, and the data is obviously …
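One way to check whether a single 170/30 split could explain the gap is to repeat the split many times and look at how much the test-set R2 moves around. The sketch below is only an illustration under stated assumptions: it uses Python rather than caret, synthetic data with a single feature instead of the roughly 700 in the post, and plain least squares. The function names (`fit_line`, `r2`, `holdout_scores`) are hypothetical helpers written for this example, not part of any library.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r2(ys, preds):
    """Coefficient of determination on a held-out set."""
    my = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def holdout_scores(xs, ys, n_test=30, n_repeats=200, seed=0):
    """Test-set R2 for many random train/test splits of the same data."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        scores.append(r2([ys[i] for i in test], preds))
    return scores


# Synthetic stand-in data (assumption): 200 samples, one noisy linear feature.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [0.8 * x + rng.gauss(0, 1) for x in xs]

scores = holdout_scores(xs, ys)
print(f"R2 over {len(scores)} splits: min={min(scores):.2f} "
      f"mean={statistics.fmean(scores):.2f} max={max(scores):.2f}")
```

If the spread between the minimum and maximum R2 is wide, a single 30-sample test set can easily look much better or worse than the cross-validated estimate purely by chance, which would be consistent with the "bad split" explanation.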

machinelearning metrics set test
