all AI news
Tactical Optimism and Pessimism for Deep Reinforcement Learning. (arXiv:2102.03765v4 [cs.LG] UPDATED)
Jan. 17, 2022, 2:10 a.m. | Ted Moskovitz, Jack Parker-Holder, Aldo Pacchiano, Michael Arbel, Michael I. Jordan
cs.LG updates on arXiv.org arxiv.org
In recent years, deep off-policy actor-critic algorithms have become a
dominant approach to reinforcement learning for continuous control. One of the
primary drivers of this improved performance is the use of pessimistic value
updates to address function approximation errors, which previously led to
disappointing performance. However, a direct consequence of pessimism is
reduced exploration, running counter to theoretical support for the efficacy of
optimism in the face of uncertainty. So which approach is best? In this work,
we show that …
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Marketing Data Analyst
@ Amazon.com | Amsterdam, North Holland, NLD
Senior Data Analyst
@ MoneyLion | Kuala Lumpur, Kuala Lumpur, Malaysia
Data Management Specialist - Office of the CDO - Chase- Associate
@ JPMorgan Chase & Co. | LONDON, LONDON, United Kingdom
BI Data Analyst
@ Nedbank | Johannesburg, ZA
Head of Data Science and Artificial Intelligence (m/f/d)
@ Project A Ventures | Munich, Germany
Senior Data Scientist - GenAI
@ Roche | Hyderabad RSS