Feb. 9, 2024, 5:42 a.m. | Lior Cohen, Kaixin Wang, Bingyi Kang, Shie Mannor

cs.LG updates on arXiv.org

Motivated by the success of Transformers when applied to sequences of discrete symbols, token-based world models (TBWMs) were recently proposed as sample-efficient methods. In TBWMs, the world model consumes agent experience as a language-like sequence of tokens, where each observation constitutes a sub-sequence. However, during imagination, the sequential token-by-token generation of next observations results in a severe bottleneck, leading to long training times, poor GPU utilization, and limited representations. To resolve this bottleneck, we devise a novel Parallel Observation Prediction …

cs.ai cs.lg

