Web: http://arxiv.org/abs/2201.09635

Jan. 31, 2022, 2:11 a.m. | Vivienne Huiling Wang, Joni Pajarinen, Tinghuai Wang, Joni Kämäräinen

cs.LG updates on arXiv.org

Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks
by performing decision-making and control at successively higher levels of
temporal abstraction. However, off-policy HRL often suffers from the problem of
a non-stationary high-level policy, because the low-level policy is constantly
changing. In this paper, we propose a novel HRL approach that mitigates this
non-stationarity by adversarially forcing the high-level policy to generate
subgoals compatible with the current instantiation of the low-level policy. In
practice, the adversarial learning is implemented by training …
