March 27, 2024, 4:41 a.m. | Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei

cs.LG updates on arXiv.org

arXiv:2403.17091v1 Announce Type: new
Abstract: We revisit the problem of offline reinforcement learning with value function realizability but without Bellman completeness. Previous work by Xie and Jiang (2021) and Foster et al. (2022) left open the question of whether a bounded concentrability coefficient together with trajectory-based offline data admits polynomial sample complexity. In this work, we provide a negative answer to this question for the task of offline policy evaluation. In addition to addressing this question, we provide a rather …
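For context, here is a minimal sketch of the standard notions the abstract refers to; the paper's exact definitions and notation may differ. Given an offline data distribution $\mu$ over state-action pairs and a target policy $\pi$ with occupancy measure $d^\pi$, the concentrability coefficient is

$$C^\pi \;=\; \sup_{s,a} \frac{d^\pi(s,a)}{\mu(s,a)}.$$

Value function realizability asks only that the true value function lies in the function class, $Q^\pi \in \mathcal{F}$, whereas Bellman completeness is the stronger requirement that the class be closed under the Bellman operator, $\mathcal{T}^\pi f \in \mathcal{F}$ for every $f \in \mathcal{F}$. The open question concerns whether bounded $C^\pi$ and realizability alone, given trajectory-based offline data, suffice for polynomial sample complexity in offline policy evaluation.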

