May 13, 2024, 4:41 a.m. | Ruixiang Sun, Hongyu Zang, Xin Li, Riashat Islam

arXiv:2405.06263v1 Announce Type: new
Abstract: Visual Model-Based Reinforcement Learning (MBRL) promises to encapsulate an agent's knowledge about the underlying dynamics of the environment, enabling it to learn a world model that serves as a useful planner. However, top MBRL agents such as Dreamer often struggle with visual pixel-based inputs in the presence of exogenous or irrelevant noise in the observation space, because they fail to capture task-specific features while filtering out irrelevant spatio-temporal details. To tackle this problem, we apply a spatio-temporal masking strategy, a …

