Web: http://arxiv.org/abs/2206.11326

June 24, 2022, 1:10 a.m. | Lucas N. Alegre, Ana L. C. Bazzan, Bruno C. da Silva

cs.LG updates on arXiv.org

In many real-world applications, reinforcement learning (RL) agents might
have to solve multiple tasks, each one typically modeled via a reward function.
If reward functions are expressed as linear combinations of features, and the agent has previously
learned a set of policies for different tasks, successor features (SFs) can be
exploited to combine such policies and identify reasonable solutions for new
problems. However, the identified solutions are not guaranteed to be optimal.
We introduce a novel algorithm that addresses this limitation. It allows RL …
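To make the idea of combining previously learned policies concrete, here is a minimal sketch of one common approach from the earlier SF literature, generalized policy improvement (GPI). It is not the algorithm this paper introduces. It assumes a reward of the form r = phi · w, so that each stored policy's value on a new task is the dot product of its successor features with the new weight vector w. The names `q_value` and `gpi_action` and the toy numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def q_value(psi_sa: Sequence[float], w: Sequence[float]) -> float:
    """Q(s, a) for a task with weights w, assuming reward = phi . w."""
    return sum(p * wi for p, wi in zip(psi_sa, w))


def gpi_action(psi: List[List[Sequence[float]]], w: Sequence[float]) -> int:
    """Pick an action at a single state by GPI.

    psi[i][a] holds the successor features of stored policy i for action a
    at the current state (hypothetical layout for this sketch).
    """
    n_actions = len(psi[0])
    # For each action, keep the best value any stored policy promises.
    best = [max(q_value(psi_i[a], w) for psi_i in psi) for a in range(n_actions)]
    # Act greedily with respect to that upper envelope.
    return max(range(n_actions), key=best.__getitem__)


# Toy data: two stored policies, two actions, two reward features.
psi = [
    [[1.0, 0.0], [0.5, 0.5]],  # policy 0
    [[0.1, 0.6], [0.0, 1.0]],  # policy 1
]
w_new = [0.3, 0.7]  # weights of a new, previously unseen task
action = gpi_action(psi, w_new)
```

The action chosen this way is at least as good as following any single stored policy, but, as the abstract notes, it is not guaranteed to be optimal for the new task.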


