Aug. 22, 2022, 9:24 p.m. | /u/ratatouille_artist

Machine Learning www.reddit.com

I don't think we talk enough about how to create good data. Models and even deployments are sexier than data creation. Let's change this! Here's my conceptual framework for data creation:

1. define success in business terms
2. map the data with stakeholders for buy-in
3. rapidly prototype from data to deployment
4. iterate on dataset creation

Techniques to get more bang for your buck out of your data:

1. weak supervision
2. active learning
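
To make the two techniques concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the labeling functions, the `complaint`/`praise` labels, and the probability list are hypothetical, and majority voting is only the simplest way to combine weak labels (dedicated libraries fit a model over the votes instead). The first half labels text with cheap, noisy heuristics; the second half picks which unlabeled items to send to annotators using uncertainty sampling.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


# Hypothetical labeling functions: cheap, noisy heuristics that vote on a label.
def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_says_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


def lf_exclamation_heavy(text):
    return "complaint" if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_refund, lf_says_thanks, lf_exclamation_heavy]


def weak_label(text):
    """Weak supervision by majority vote; abstain if nobody votes or on a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # tied vote: leave the example unlabeled
    return counts[0][0]


def pick_for_annotation(probabilities, budget):
    """Active learning by uncertainty sampling.

    `probabilities` holds a model's predicted positive-class probability for
    each unlabeled item. Items closest to 0.5 are the ones the model is least
    sure about, so they are returned first, up to `budget` indices.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:budget]
```

In practice the two combine well: weak labels give you a first noisy training set, and uncertainty sampling tells you where a human label is worth paying for.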

Considerations around dataset creation:

…
