Continuous MDP Homomorphisms and Homomorphic Policy Gradient. (arXiv:2209.07364v1 [cs.LG])
Sept. 16, 2022, 1:12 a.m. | Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup
cs.LG updates on arXiv.org arxiv.org
Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this paper, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to encompass continuous actions in continuous state spaces. We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an actor-critic algorithm that is able to learn …
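To make the abstraction idea concrete, here is a minimal sketch (not the paper's code; the toy MDP, the maps `f` and `g`, and all names are illustrative assumptions). It shows a continuous MDP homomorphism candidate for a 2-D point mass with rotational symmetry, mapped onto a 1-D abstract MDP, and checks that rewards match exactly while the dynamics commute only approximately for small actions — the kind of approximate symmetry the abstract refers to.

```python
import numpy as np

# Hypothetical toy MDP (not from the paper): states x in R^2,
# actions a in R^2, dynamics x' = x + a, reward r(x, a) = -||x||.
# The reward is rotation-invariant, so the MDP has a continuous symmetry.
def step(x, a):
    return x + a

def reward(x, a):
    return -np.linalg.norm(x)

# Candidate continuous MDP homomorphism onto a 1-D abstract MDP:
# state map f(x) = ||x||; state-dependent action map
# g_x(a) = <a, x/||x||>, the radial component of the action.
def f(x):
    return np.linalg.norm(x)

def g(x, a):
    return a @ (x / np.linalg.norm(x))

def abstract_step(z, u):
    return z + u

# Homomorphism conditions: rewards must agree, and dynamics must
# commute with the state/action maps. Here the reward condition holds
# exactly, while the dynamics commute only to first order in ||a||,
# i.e. the symmetry is approximate.
rng = np.random.default_rng(0)
x = rng.normal(size=2)
a = 1e-3 * rng.normal(size=2)  # small action => small commutation error

reward_gap = abs(reward(x, a) - (-f(x)))
dyn_gap = abs(f(step(x, a)) - abstract_step(f(x), g(x, a)))
print(reward_gap, dyn_gap)  # reward gap is zero; dynamics gap is O(||a||^2)
```

A homomorphic policy gradient method would train the actor and critic on the abstract state `f(x)`, then lift the abstract action back to the original action space — the derivation of when that lifted gradient is exact is the paper's contribution.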
More from arxiv.org / cs.LG updates on arXiv.org
Regularization by Texts for Latent Diffusion Inverse Solvers
1 day, 10 hours ago
arxiv.org
When can transformers reason with abstract symbols?
1 day, 10 hours ago
arxiv.org
Jobs in AI, ML, Big Data
Data Scientist (m/f/x/d)
@ Symanto Research GmbH & Co. KG | Spain, Germany
Automated Greenhouse Expert - Phenotyping & Data Analysis (all genders)
@ Bayer | Frankfurt a.M., Hessen, DE
Machine Learning Scientist II
@ Expedia Group | India - Bengaluru
Data Engineer/Senior Data Engineer, Bioinformatics
@ Flagship Pioneering, Inc. | Cambridge, MA USA
Intern (AI lab)
@ UL Solutions | Dublin, Co. Dublin, Ireland
Senior Operations Research Analyst / Predictive Modeler
@ LinQuest | Colorado Springs, Colorado, United States