On-Policy Model Errors in Reinforcement Learning. (arXiv:2110.07985v2 [cs.LG] UPDATED)
March 4, 2022, 2:12 a.m. | Lukas P. Fröhlich, Maksym Lefarov, Melanie N. Zeilinger, Felix Berkenkamp
cs.LG updates on arXiv.org arxiv.org
Model-free reinforcement learning algorithms can compute policy gradients
given sampled environment transitions, but require large amounts of data. In
contrast, model-based methods can use the learned model to generate new data,
but model errors and bias can render learning unstable or suboptimal. In this
paper, we present a novel method that combines real-world data and a learned
model in order to get the best of both worlds. The core idea is to exploit the
real-world data for on-policy predictions and …
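The abstract is truncated, so the paper's actual algorithm is not shown here; as a purely illustrative sketch of the general idea (using real on-policy transitions to correct a biased learned model before rolling it out), consider a toy 1-D system where the learned model has a constant bias. All dynamics, the policy, and the correction scheme below are assumptions for illustration, not the method from the paper:

```python
import numpy as np

# Illustrative true dynamics (unknown to the agent): s' = 0.9*s + a
def env_step(s, a):
    return 0.9 * s + a

# Learned model with a constant bias, standing in for model error.
def model_step(s, a):
    return 0.9 * s + a - 0.1

# Fixed policy for on-policy data collection (assumed for the example).
def policy(s):
    return -0.5 * s

# Collect real on-policy transitions and the model's predictions for them.
rng = np.random.default_rng(0)
states = rng.normal(size=50)
real_next = np.array([env_step(s, policy(s)) for s in states])
model_next = np.array([model_step(s, policy(s)) for s in states])

# On-policy correction: mean residual between real and predicted transitions.
correction = np.mean(real_next - model_next)

# Compare 10-step rollouts: real vs corrected model vs uncorrected model.
sr = sc = sp = 1.0
real_traj, corr_traj, plain_traj = [], [], []
for _ in range(10):
    sr = env_step(sr, policy(sr))
    sc = model_step(sc, policy(sc)) + correction
    sp = model_step(sp, policy(sp))
    real_traj.append(sr)
    corr_traj.append(sc)
    plain_traj.append(sp)

err_corr = np.mean(np.abs(np.array(real_traj) - np.array(corr_traj)))
err_plain = np.mean(np.abs(np.array(real_traj) - np.array(plain_traj)))
print(err_corr < err_plain)  # corrected rollouts track the real trajectory more closely
```

Because the toy model error is a constant bias, the on-policy residual recovers it exactly and the corrected rollout matches the real trajectory; with richer model errors, a state-dependent correction would be needed.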