Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
March 13, 2024, 4:42 a.m. | Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada
Source: cs.LG updates on arXiv.org
Abstract: In deep reinforcement learning, estimating the value function to evaluate the quality of states and actions is essential. The value function is often trained using the least squares method, which implicitly assumes a Gaussian error distribution. However, a recent study suggested that the error distribution for training the value function is often skewed because of the properties of the Bellman operator, and violates the implicit assumption of normal error distribution in the least squares method. …
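As a rough illustration of how a Bellman-style target can end up with a skewed error, the toy sketch below takes the maximum over several noisy action-value estimates and measures the sample skewness of the resulting error. It assumes that the max over noisy estimates is one source of the skew; the truncated abstract only attributes it to "the properties of the Bellman operator", so this is not the paper's experiment or method. The names `sample_skewness` and `max_target_errors`, the zero true values, and the Gaussian noise model are all hypothetical choices made for the example.

```python
import random


def sample_skewness(xs):
    """Return the (biased) sample skewness m3 / m2**1.5 of a list of floats."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def max_target_errors(num_actions, num_samples, noise_std=1.0, seed=0):
    """Simulate errors of a max-based target under symmetric estimation noise.

    Assumption for this sketch: every action has true value 0, and each
    estimate is the true value plus independent zero-mean Gaussian noise.
    """
    rng = random.Random(seed)
    true_max = 0.0
    errors = []
    for _ in range(num_samples):
        noisy_q = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        errors.append(max(noisy_q) - true_max)
    return errors


# With one action there is no max to take, so the error stays symmetric.
skew_single = sample_skewness(max_target_errors(1, 20000))
# With several actions, the max picks out upward noise and skews the error.
skew_multi = sample_skewness(max_target_errors(5, 20000))

print(f"skewness with 1 action:  {skew_single:.3f}")
print(f"skewness with 5 actions: {skew_multi:.3f}")
```

In this toy setting the single-action error is close to symmetric, while the five-action error is positively skewed even though the underlying noise is Gaussian. A squared-error loss fitted to such targets would be treating a skewed error as if it were normal, which is the mismatch the abstract describes.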