Feb. 6, 2024, 5:47 a.m. | Tom Blau, Iadine Chades, Amir Dezfouli, Daniel Steinberg, Edwin V. Bonilla

cs.LG updates on arXiv.org

Reinforcement learning can learn amortised policies for designing sequences of experiments. However, current amortised methods rely on estimators of expected information gain (EIG) that require a number of samples exponential in the magnitude of the EIG to achieve unbiased estimation. We propose an alternative estimator based on the cross-entropy of the joint model distribution and a flexible proposal distribution. This proposal distribution approximates the true posterior of the model parameters given the experimental history and …
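The cross-entropy estimator described in the abstract is closely related to the Barber-Agakov lower bound on mutual information: fit a proposal q(θ | y) to samples from the joint model and score E[log q(θ | y)] + H[p(θ)]. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes a toy one-dimensional linear-Gaussian model (where the exact EIG has a closed form), a fixed design d, and a simple least-squares fit standing in for the flexible proposal distribution, and contrasts the bound with the nested Monte Carlo estimator whose inner loop drives the exponential sample requirement.

```python
# Minimal sketch: nested Monte Carlo EIG vs. a cross-entropy
# (Barber-Agakov-style) lower bound on a linear-Gaussian toy model.
# The model, design d, and least-squares proposal fit are illustrative
# assumptions, not the paper's method.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
sigma, d = 1.0, 2.0        # observation noise std, fixed design

def log_normal(x, mu, s):
    return -0.5 * ((x - mu) / s) ** 2 - np.log(s * np.sqrt(2.0 * np.pi))

def sample_joint(n):
    """Draw (theta, y) ~ p(theta) p(y | theta, d) with a N(0, 1) prior."""
    theta = rng.normal(size=n)
    y = d * theta + sigma * rng.normal(size=n)
    return theta, y

def nested_mc_eig(n_outer=2000, n_inner=2000):
    """Nested Monte Carlo: E[log p(y|theta,d)] - E[log p(y|d)].
    The inner marginal-likelihood estimate is biased unless n_inner
    is large relative to exp(EIG)."""
    theta, y = sample_joint(n_outer)
    log_lik = log_normal(y, d * theta, sigma)
    theta_in = rng.normal(size=(n_inner, 1))       # fresh prior draws
    log_marg = logsumexp(log_normal(y[None, :], d * theta_in, sigma),
                         axis=0) - np.log(n_inner)
    return np.mean(log_lik - log_marg)

def cross_entropy_eig(n=4000):
    """Lower bound: E[log q(theta | y)] + H[p(theta)], with a Gaussian
    proposal q fit to joint samples by least squares."""
    theta, y = sample_joint(n)
    A = np.stack([y, np.ones_like(y)], axis=1)
    coef, *_ = np.linalg.lstsq(A, theta, rcond=None)
    resid_std = (theta - A @ coef).std()
    log_q = log_normal(theta, A @ coef, resid_std)
    prior_entropy = 0.5 * np.log(2.0 * np.pi * np.e)   # H[N(0, 1)]
    return np.mean(log_q) + prior_entropy

exact = 0.5 * np.log(1.0 + d ** 2 / sigma ** 2)        # closed form
print(f"exact EIG:     {exact:.4f}")
print(f"nested MC:     {nested_mc_eig():.4f}")
print(f"cross-entropy: {cross_entropy_eig():.4f}")
```

In this conjugate toy model the fitted proposal nearly recovers the true posterior, so the cross-entropy bound sits just below the exact EIG, whereas the nested Monte Carlo estimate remains biased whenever the inner sample count is small relative to exp(EIG).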

