Web: http://arxiv.org/abs/2209.06620

Sept. 30, 2022, 1:14 a.m. | Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

stat.ML updates on arXiv.org arxiv.org

Among the obstacles to applying reinforcement learning (RL) to real-world
problems, two factors are critical: limited data and the mismatch between the
testing environment (the real environment in which the policy is deployed) and
the training environment (e.g., a simulator). This paper addresses both issues
simultaneously via distributionally robust offline RL: we learn a
distributionally robust policy from historical data gathered in the source
environment by optimizing against a worst-case perturbation of that
environment. In particular, we move beyond tabular …

Tags: approximation, arxiv, function, linear, offline, reinforcement learning
