Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation. (arXiv:2209.06620v2 [cs.LG] UPDATED)
stat.ML updates on arXiv.org
Among the reasons hindering reinforcement learning (RL) applications to
real-world problems, two factors are critical: limited data and the mismatch
between the testing environment (the real environment in which the policy is
deployed) and the training environment (e.g., a simulator). This paper attempts
to address these issues simultaneously with distributionally robust offline RL,
where we learn a distributionally robust policy using historical data obtained
from the source environment by optimizing against a worst-case perturbation
thereof. In particular, we move beyond tabular …