Feb. 12, 2024, 5:42 a.m. | Naiqing Guan, Nick Koudas

cs.LG updates on arXiv.org arxiv.org

Modern machine learning models require large labelled datasets to achieve good performance, but manually labelling large datasets is expensive and time-consuming. The data programming paradigm enables users to label large datasets efficiently but produces noisy labels, which degrade the downstream model's performance. The active learning paradigm, on the other hand, can acquire accurate labels, but only for a small fraction of instances. In this paper, we propose ActiveDP, an interactive framework that bridges active learning and data programming to generate …
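To make the two paradigms concrete, here is a minimal Python sketch. It is not ActiveDP's algorithm, which the truncated abstract does not describe. It only illustrates the general ideas: data programming labels many instances cheaply with heuristic labeling functions, and an active-learning step selects the instance whose weak label is least reliable so that a human can label it. The toy spam task, the three labeling functions, the majority-vote aggregation, and the agreement score are all assumptions chosen for illustration.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical labeling functions for a toy spam task (1 = spam, 0 = not spam).
def lf_has_link(text):
    return 1 if "http" in text else ABSTAIN

def lf_mentions_prize(text):
    return 1 if "prize" in text.lower() else ABSTAIN

def lf_starts_with_greeting(text):
    return 0 if text.lower().startswith(("hi", "hello")) else ABSTAIN

LABELING_FUNCTIONS = [lf_has_link, lf_mentions_prize, lf_starts_with_greeting]

def weak_label(text):
    """Majority vote over the labeling functions that do not abstain.

    Returns (label, agreement). The label is None when every function
    abstains or when the vote is tied.
    """
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v is not ABSTAIN]
    if not votes:
        return None, 0.0
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None, 0.5
    label, n = counts[0]
    return label, n / len(votes)

def pick_query(texts):
    """Active-learning step: pick the instance with the lowest agreement."""
    return min(texts, key=lambda t: weak_label(t)[1])

texts = [
    "Claim your prize at http://example.com",
    "Hello, are we still meeting tomorrow?",
    "Hi, you won a prize!",
    "Quarterly report attached",
]
weak_labels = [weak_label(t)[0] for t in texts]  # noisy labels for every instance
query = pick_query(texts)                        # the one instance sent to a human
```

In this toy run the labeling functions cover the first two messages, conflict on the third, and abstain entirely on the fourth, so the fourth is the one selected for manual labelling. A real system would typically replace majority voting with a learned label model and use a trained classifier's uncertainty to choose queries.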

