April 24, 2024, 4:43 a.m. | Roger Creus Castanyer, Joshua Romoff, Glen Berseth

cs.LG updates on arXiv.org

arXiv:2310.18144v4 Announce Type: replace
Abstract: Exploration bonuses in reinforcement learning guide long-horizon exploration by defining custom intrinsic objectives. Several exploration objectives, such as count-based bonuses, pseudo-counts, and state-entropy maximization, are non-stationary and hence difficult for the agent to optimize. While this issue is generally known, it is usually ignored and solutions remain under-explored. The key contribution of our work lies in transforming the original non-stationary rewards into stationary rewards through an augmented state representation. For this purpose, we introduce the …
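The abstract states the core idea without details, so here is a minimal, hedged illustration in the tabular setting: a count-based bonus such as 1/sqrt(N(s)) is non-stationary because the counts N change as the agent explores, but folding the counts into an augmented state makes the same bonus a stationary function of that augmented state. The wrapper name, the environment API, and the discrete-state assumption are all illustrative; this is a sketch of the general idea, not the paper's implementation.

```python
# Illustrative sketch, not the paper's code. Assumes a tabular environment
# whose states are integers 0..n_states-1 and whose API follows the classic
# reset()/step() convention.
import numpy as np


class StationaryCountBonusWrapper:
    """Augments observations with visitation counts so that a count-based
    intrinsic bonus becomes a stationary function of the augmented state."""

    def __init__(self, env, n_states, bonus_scale=1.0):
        self.env = env
        self.counts = np.zeros(n_states, dtype=np.int64)
        self.bonus_scale = bonus_scale

    def _augment(self, state):
        # Augmented observation: original state plus the current counts.
        # The intrinsic reward below depends only on this augmented state.
        return state, self.counts.copy()

    def reset(self):
        state = self.env.reset()
        return self._augment(state)

    def step(self, action):
        state, extrinsic_reward, done, info = self.env.step(action)
        self.counts[state] += 1
        # Count-based bonus: a fixed function of the counts carried in the
        # augmented state, rather than of hidden, time-varying statistics.
        intrinsic_reward = self.bonus_scale / np.sqrt(self.counts[state])
        return self._augment(state), extrinsic_reward + intrinsic_reward, done, info
```

Conditioned on the counts, the agent's intrinsic objective no longer drifts as training progresses, which is the stationarity property the abstract refers to.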
