On the Estimation Bias in Double Q-Learning. (arXiv:2109.14419v3 [cs.LG] UPDATED)
Jan. 17, 2022, 2:11 a.m. | Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang
Source: cs.LG updates on arXiv.org
Double Q-learning is a classical method for reducing overestimation bias,
which is caused by taking maximum estimated values in the Bellman operation.
Its variants in the deep Q-learning paradigm have shown great promise in
producing reliable value prediction and improving learning performance.
However, as shown by prior work, double Q-learning is not fully unbiased and
suffers from underestimation bias. In this paper, we show that such
underestimation bias may lead to multiple non-optimal fixed points under an
approximate Bellman operator. …
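
The two biases described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the paper's method or experimental setup. It assumes a single state with five actions, hypothetical true values spaced evenly between 0 and 1, and two independent value estimates corrupted by Gaussian noise. The function name `estimator_biases` and all parameters are illustrative choices.

```python
import numpy as np


def estimator_biases(true_values, noise_std=1.0, n_trials=20000, seed=0):
    """Estimate the bias of the single (max) and double estimators.

    Assumption: each estimate is the true action value plus independent
    zero-mean Gaussian noise. This is a toy model, not the paper's setting.
    """
    rng = np.random.default_rng(seed)
    true_values = np.asarray(true_values, dtype=float)
    shape = (n_trials, true_values.size)

    # Two independent noisy estimates of the same action values.
    q_a = true_values + rng.normal(0.0, noise_std, size=shape)
    q_b = true_values + rng.normal(0.0, noise_std, size=shape)

    # Single estimator: take the maximum of one noisy estimate.
    single = q_a.max(axis=1)

    # Double estimator: select the action with q_a, evaluate it with q_b.
    greedy = q_a.argmax(axis=1)
    double = q_b[np.arange(n_trials), greedy]

    target = true_values.max()  # the quantity both estimators aim at
    return single.mean() - target, double.mean() - target


single_bias, double_bias = estimator_biases(np.linspace(0.0, 1.0, 5))
print(f"single-estimator bias: {single_bias:+.3f}")
print(f"double-estimator bias: {double_bias:+.3f}")
```

Under these assumptions the single estimator's bias comes out positive and the double estimator's bias comes out negative, because the action selected by one noisy estimate is not always the truly best action.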