Sept. 16, 2022, 1:12 a.m. | Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup

cs.LG updates on arXiv.org

Abstraction has been widely studied as a way to improve the efficiency and
generalization of reinforcement learning algorithms. In this paper, we study
abstraction in the continuous-control setting. We extend the definition of MDP
homomorphisms to encompass continuous actions in continuous state spaces. We
derive a policy gradient theorem on the abstract MDP, which allows us to
leverage approximate symmetries of the environment for policy optimization.
Based on this theorem, we propose an actor-critic algorithm that is able to
learn …
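As a rough intuition for the homomorphism condition the abstract refers to, here is a minimal sketch (not the paper's algorithm; all names and the toy environment are hypothetical). It builds a 1-D point-mass MDP with an exact reflection symmetry and checks that the abstraction map commutes with the dynamics and preserves reward, which is the defining property of an MDP homomorphism extended to continuous states and actions.

```python
import numpy as np

def step(s, a):
    """Toy 1-D point mass: next state and reward (reflection-symmetric)."""
    s_next = s + a
    return s_next, -abs(s_next)

def abstract(s, a):
    """Hypothetical homomorphism h(s, a) = (|s|, sign(s) * a),
    collapsing the reflection symmetry s -> -s, a -> -a."""
    sgn = 1.0 if s >= 0 else -1.0
    return abs(s), sgn * a

def abstract_step(z, u):
    """Induced dynamics on the abstract MDP."""
    z_next = abs(z + u)
    return z_next, -z_next

# Verify the homomorphism condition: abstracting then stepping
# matches stepping then abstracting, and rewards agree.
for s, a in [(1.5, -0.3), (-1.5, 0.3), (2.0, 1.0), (-2.0, -1.0)]:
    s_next, r = step(s, a)
    z, u = abstract(s, a)
    z_next, r_abs = abstract_step(z, u)
    assert np.isclose(z_next, abs(s_next)) and np.isclose(r_abs, r)
```

A policy learned on the abstract MDP can then be lifted back to the original one, which is the kind of structure the homomorphic policy gradient exploits; the symmetric state pairs above map to the same abstract state, halving the effective state space.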

