Aug. 10, 2023, 4:43 a.m. | Lukas Prediger, Joonas Jälkö, Antti Honkela, Samuel Kaski

cs.LG updates on arXiv.org arxiv.org

Consider a setting where multiple parties holding sensitive data aim to
collaboratively learn population-level statistics, but pooling the sensitive
data sets is not possible. We propose a framework in which each party shares a
differentially private synthetic twin of their data. We study the feasibility
of combining such synthetic twin data sets for collaborative learning on
real-world health data from the UK Biobank. We discover that parties engaging
in collaborative learning via shared synthetic data obtain more accurate …

