Web: http://arxiv.org/abs/2205.01965

May 5, 2022, 1:12 a.m. | Lorenzo Steccanella, Anders Jonsson

cs.LG updates on arXiv.org

This paper presents a novel state representation for reward-free Markov
decision processes. The idea is to learn, in a self-supervised manner, an
embedding space where distances between pairs of embedded states correspond to
the minimum number of actions needed to transition between them. Compared to
previous methods, our approach does not require any domain knowledge, learning
from offline and unlabeled data. We show how this representation can be
leveraged to learn goal-conditioned policies, providing a notion of similarity
between states …
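
To make the central idea concrete, here is a minimal Python sketch of the target the embedding is meant to match. It is not the authors' method: it assumes a small, fully known, deterministic transition graph (the hypothetical `chain` below), whereas the paper learns from offline, unlabeled data. The helpers `min_action_counts` and `embedding_distance_loss` are illustrative names, and the squared-error objective is only one plausible way to score how well embedded distances agree with minimum action counts.

```python
from collections import deque
import math


def min_action_counts(transitions, start):
    """Breadth-first search: fewest actions from `start` to each reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


def embedding_distance_loss(embed, transitions):
    """Mean squared gap between Euclidean embedded distance and the
    minimum action count, over all ordered pairs of distinct reachable states."""
    total, count = 0.0, 0
    for s in transitions:
        for t, steps in min_action_counts(transitions, s).items():
            if s == t:
                continue
            total += (math.dist(embed[s], embed[t]) - steps) ** 2
            count += 1
    return total / count if count else 0.0


# Hypothetical 4-state chain (an assumption for illustration): 0 <-> 1 <-> 2 <-> 3
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# States spaced one unit apart on a line: distances equal action counts.
line_embedding = {s: (float(s),) for s in chain}

# All states collapsed to one point: distances carry no information.
collapsed_embedding = {s: (0.0,) for s in chain}

print(embedding_distance_loss(line_embedding, chain))
print(embedding_distance_loss(collapsed_embedding, chain))
```

In this toy setting the line embedding scores a loss of zero while the collapsed embedding does not, which is the property the abstract describes: nearby points in the embedding space are states that are few actions apart.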

