all AI news
Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
April 12, 2024, 4:43 a.m. | Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare
cs.LG updates on arXiv.org arxiv.org
Abstract: Deep reinforcement learning agents for continuous control are known to exhibit significant instability in their performance over time. In this work, we provide a fresh perspective on these behaviors by studying the return landscape: the mapping between a policy and a return. We find that popular algorithms traverse noisy neighborhoods of this landscape, in which a single update to the policy parameters leads to a wide range of returns. By taking a distributional view of …
abstract agents arxiv continuous control cs.lg landscape mapping optimization performance perspective policy reinforcement reinforcement learning studying type work
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer
@ GPTZero | Toronto, Canada
ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)
@ HelloBetter | Remote
Doctoral Researcher (m/f/div) in Automated Processing of Bioimages
@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena
Seeking Developers and Engineers for AI T-Shirt Generator Project
@ Chevon Hicks | Remote
Senior Applied Data Scientist
@ dunnhumby | London
Principal Data Architect - Azure & Big Data
@ MGM Resorts International | Home Office - US, NV