June 30, 2022, 1:11 a.m. | Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu

stat.ML updates on arXiv.org

In this work, we study the use of the Bellman equation as a surrogate
objective for value prediction accuracy. While the Bellman equation is uniquely
solved by the true value function over all state-action pairs, we find that the
Bellman error (the difference between both sides of the equation) is a poor
proxy for the accuracy of the value function. In particular, we show that (1)
due to cancellations from both sides of the Bellman equation, the magnitude of
the …
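
To make the definitions above concrete, here is a minimal sketch (not the authors' experiment) that compares the Bellman error with the actual value error on a tiny, hypothetical problem. The 3-state deterministic chain, its rewards, the discount of 0.9, and the constant offset are all assumptions chosen for illustration, and state values are used instead of state-action values for brevity. It shows one simple way the two sides of the Bellman equation can cancel: shifting every value by a constant c changes the value error by c but the Bellman error only by c(1 - gamma).

```python
# Hypothetical 3-state deterministic chain under a fixed policy:
# state s always moves to (s + 1) % 3 and collects REWARDS[s].
GAMMA = 0.9
REWARDS = [1.0, 0.0, 2.0]


def next_state(s):
    return (s + 1) % len(REWARDS)


def true_values(iterations=2000):
    """Iterate the Bellman equation V(s) = r(s) + gamma * V(s') to convergence."""
    v = [0.0] * len(REWARDS)
    for _ in range(iterations):
        v = [REWARDS[s] + GAMMA * v[next_state(s)] for s in range(len(REWARDS))]
    return v


def bellman_errors(v):
    """Difference between the two sides of the Bellman equation, per state."""
    return [v[s] - (REWARDS[s] + GAMMA * v[next_state(s)]) for s in range(len(v))]


def value_errors(v, v_true):
    """Difference between an estimate and the true value, per state."""
    return [a - b for a, b in zip(v, v_true)]


v_true = true_values()

# An estimate that is wrong everywhere by the same constant offset.
offset = 10.0
v_shifted = [x + offset for x in v_true]

max_bellman = max(abs(e) for e in bellman_errors(v_shifted))
max_value = max(abs(e) for e in value_errors(v_shifted, v_true))

print(f"max |Bellman error| = {max_bellman:.4f}")
print(f"max |value error|   = {max_value:.4f}")
```

In this toy setting the shifted estimate is off by 10 in every state, yet its Bellman error is only 10 × (1 - 0.9) = 1, because the offset appears on both sides of the equation and mostly cancels. As the discount approaches 1, the gap between the two quantities grows.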
