April 17, 2024, 4:43 a.m. | Chris Cundy, Rishi Desai, Stefano Ermon

cs.LG updates on arXiv.org

arXiv:2012.15019v3 Announce Type: replace
Abstract: As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions. We give examples of how this setting covers real-world problems in privacy for sequential decision-making. We solve this problem in the policy gradients framework by introducing a regularizer based on the …

