[D] Preserving spatial distribution of data during data splitting | allainews.com

April 24, 2024, 5:14 p.m. | /u/dr_greg_mouse

Machine Learning www.reddit.com

Hello, I am trying to model nitrate concentrations in streams in Bavaria, Germany, using a Random Forest model. I am using Python, primarily sklearn. I have data from 490 water quality stations. I am following the methodology in the paper by Longzhu Q. Shen et al., which can be found here: [https://www.nature.com/articles/s41597-020-0478-7](https://www.nature.com/articles/s41597-020-0478-7)

I want to split my dataset into training and testing sets such that the spatial distribution of the data in both sets is identical. The idea …
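One common way to approximate this kind of split with sklearn is to cluster the station coordinates and then stratify the train/test split on the cluster labels, so each region contributes to both sets in roughly the same proportion. The sketch below is only an illustration under stated assumptions, not the method from the linked paper: the function name `spatially_stratified_split`, the choice of 10 clusters, the 80/20 ratio, and the synthetic coordinates are all hypothetical. It also treats longitude/latitude as planar coordinates, which is a simplification; projected coordinates would likely be a better input for KMeans.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split


def spatially_stratified_split(coords, test_size=0.2, n_clusters=10, seed=0):
    """Split station indices so every spatial cluster appears in both sets.

    Hypothetical helper: clusters the coordinates with KMeans, then uses the
    cluster labels as strata for train_test_split.
    """
    coords = np.asarray(coords, dtype=float)
    # Assign each station to a spatial cluster based on its coordinates.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(coords)
    idx = np.arange(len(coords))
    # Stratifying on the labels keeps cluster proportions similar in both sets.
    train_idx, test_idx = train_test_split(
        idx, test_size=test_size, stratify=labels, random_state=seed
    )
    return train_idx, test_idx, labels


# Synthetic stand-in for 490 station coordinates (assumed lon/lat ranges, not real data).
rng = np.random.default_rng(0)
coords = rng.uniform([9.0, 47.3], [13.8, 50.5], size=(490, 2))

train_idx, test_idx, labels = spatially_stratified_split(coords)

# Compare the share of each cluster in the training and testing sets.
train_share = np.bincount(labels[train_idx], minlength=10) / len(train_idx)
test_share = np.bincount(labels[test_idx], minlength=10) / len(test_idx)
print(np.round(train_share, 3))
print(np.round(test_share, 3))
```

The returned index arrays can be used to slice the feature matrix and target vector before fitting `RandomForestRegressor`. Stratifying only matches the coarse regional shares; it does not address spatial autocorrelation between nearby stations in different sets, which may call for a blocked or grouped split instead.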

