April 24, 2024, 4:43 a.m. | Roger Creus Castanyer, Joshua Romoff, Glen Berseth

cs.LG updates on arXiv.org

arXiv:2310.18144v4 Announce Type: replace
Abstract: Exploration bonuses in reinforcement learning guide long-horizon exploration by defining custom intrinsic objectives. Several exploration objectives, such as count-based bonuses, pseudo-counts, and state-entropy maximization, are non-stationary and hence difficult for the agent to optimize. While this issue is generally known, it is usually set aside, and solutions remain under-explored. The key contribution of our work lies in transforming the original non-stationary rewards into stationary rewards through an augmented state representation. For this purpose, we introduce the …
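To make the core idea concrete, here is a minimal illustrative sketch (not the paper's actual method or API) using a count-based bonus on a discrete state space: the bonus 1/sqrt(N(s)) is non-stationary because the counts N change over training, but the same bonus is a fixed, stationary function of the augmented state (s, N(s)). The wrapper name and interface are hypothetical.

```python
import numpy as np


class CountAugmentedWrapper:
    """Illustrative sketch: augment discrete observations with their visit
    counts so a count-based exploration bonus becomes a stationary function
    of the augmented state. Class name and API are hypothetical, not the
    paper's implementation."""

    def __init__(self, n_states: int):
        self.counts = np.zeros(n_states, dtype=np.int64)

    def step(self, state: int):
        self.counts[state] += 1
        # Viewed as a function of `state` alone, this bonus is
        # non-stationary: it changes as the counts accumulate.
        bonus = 1.0 / np.sqrt(self.counts[state])
        # Viewed as a function of the augmented state (state, count),
        # the same bonus is stationary: (s, n) always maps to 1/sqrt(n).
        augmented_obs = (state, int(self.counts[state]))
        return augmented_obs, bonus
```

Feeding `augmented_obs` (rather than the raw state) to the policy lets a standard RL algorithm treat the intrinsic reward as an ordinary stationary objective.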

