Sept. 9, 2022, 5 p.m. | DataTalks.Club

We talked about:



  • Christiaan’s background

  • Usual ways of collecting and curating data

  • Getting the buy-in from experts and executives

  • Starting an annotation booklet

  • Pre-labeling

  • Dataset collection

  • Human-level baseline and feedback

  • Using the annotation booklet to boost annotation productivity

  • Putting yourself in the shoes of annotators (and measuring performance)

  • Active learning

  • Distant supervision

  • Weak labeling

  • Dataset collection in career positioning and project portfolios

  • IPython widgets

  • GDPR compliance and non-English NLP

  • Finding Christiaan online




Links:



  • My personal blog: https://useml.net/

  • Comtura, my …
