Jan. 9, 2022, 6:43 a.m. | /u/onzie9

Data Science www.reddit.com

Task: I have to write a regression model. The inputs to the model are 11-element vectors, each derived from a cohort. The cohorts vary in size and generally have 500-2000 individual users.

Problem: I don't have enough data. I have maybe 500 of these cohorts.

Potential solution: Once the cohorts come into our pipeline, the individuals' data are available, including what cohort they came from. Each cohort is defined by 4 parameters (e.g. country, age range, gender, income range). So my …
