Nov. 1, 2023, 1:06 p.m. | David Stutz

Blog Archives • David Stutz davidstutz.de

In supervised machine learning, we usually assume access to ground truth labels for evaluation. In many applications, however, these labels are derived from expert opinions, and disagreement among the experts is typically ignored by aggregating their opinions with simple majority voting or averaging. Unfortunately, this can have severe consequences: it can overestimate performance or misguide model selection. In the work presented in this article, we tackle this problem by introducing a statistical framework for aggregating expert opinions.


The post ArXiv Pre-Print “Evaluating AI Systems …

