March 13, 2024, 4:43 a.m. | Prajjwal Bhargava, Rohan Chitnis, Alborz Geramifard, Shagun Sodhani, Amy Zhang

cs.LG updates on arXiv.org

arXiv:2305.14550v3 Announce Type: replace
Abstract: Offline reinforcement learning (RL) allows agents to learn effective, return-maximizing policies from a static dataset. Three popular algorithms for offline RL are Conservative Q-Learning (CQL), Behavior Cloning (BC), and Decision Transformer (DT), from the class of Q-Learning, Imitation Learning, and Sequence Modeling respectively. A key open question is: which algorithm is preferred under what conditions? We study this question empirically by exploring the performance of these algorithms across the commonly used D4RL and Robomimic benchmarks. …
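To make the three families concrete, here is a minimal, illustrative sketch (not the paper's code; QNet, cql_loss, the batch layout, and the alpha weight are hypothetical names chosen for illustration) of the defining piece of CQL for discrete actions: a conservative regularizer added to the standard TD loss that pushes Q-values down on all actions while pushing them up on dataset actions, discouraging overestimation of out-of-distribution actions. BC, by contrast, is plain supervised prediction of dataset actions, and DT frames the problem as return-conditioned sequence modeling.

import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    # Small Q-network mapping a state to one Q-value per discrete action.
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def cql_loss(q_net, target_net, batch, gamma=0.99, alpha=1.0):
    # batch = (state, action, reward, next_state, done), sampled from the
    # static offline dataset; no environment interaction is needed.
    s, a, r, s2, done = batch
    q = q_net(s)                                      # (B, n_actions)
    q_taken = q.gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) on dataset actions
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    td = F.mse_loss(q_taken, target)                  # standard Bellman error
    # Conservative term: logsumexp over all actions minus the dataset-action
    # Q-value; alpha trades off conservatism against fitting the TD target.
    conservative = (torch.logsumexp(q, dim=1) - q_taken).mean()
    return td + alpha * conservative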
