Robustness and risk management via distributional dynamic programming. (arXiv:2112.15430v1 [cs.LG])
Jan. 3, 2022, 2:10 a.m. | Mastane Achab, Gergely Neu
cs.LG updates on arXiv.org arxiv.org
In dynamic programming (DP) and reinforcement learning (RL), an agent learns
to act optimally in terms of expected long-term return by sequentially
interacting with its environment modeled by a Markov decision process (MDP).
More generally, in distributional reinforcement learning (DRL), the focus is on
the whole distribution of the return, not just its expectation. Although
DRL-based methods have produced state-of-the-art performance in RL with function
approximation, they involve additional quantities (compared to the
non-distributional setting) that are still not well understood. …
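
To make the distinction between the expected return and the full return distribution concrete, here is a minimal sketch. It is not taken from the paper: the one-state MDP, the two actions, the discount factor, and the lower-tail statistic (an empirical CVaR-style average) are all assumptions chosen for illustration. Both actions have the same expected reward per step, so a purely expectation-based criterion would not be expected to separate them, while the sampled return distributions differ.

```python
import random
import statistics


def sample_return(rng, risky, gamma=0.9, horizon=50):
    """Discounted return of one episode in a hypothetical one-state MDP."""
    total = 0.0
    for t in range(horizon):
        if risky:
            # Hypothetical risky action: reward 2 or 0 with equal probability (mean 1).
            reward = 2.0 if rng.random() < 0.5 else 0.0
        else:
            # Hypothetical safe action: deterministic reward 1.
            reward = 1.0
        total += (gamma ** t) * reward
    return total


def return_samples(n, risky, seed=0):
    """Draw n Monte Carlo samples of the discounted return."""
    rng = random.Random(seed)
    return [sample_return(rng, risky) for _ in range(n)]


def lower_tail_mean(samples, alpha=0.1):
    """Mean of the worst alpha-fraction of sampled returns (CVaR-style, illustrative)."""
    k = max(1, int(alpha * len(samples)))
    return statistics.fmean(sorted(samples)[:k])


safe_returns = return_samples(5000, risky=False)
risky_returns = return_samples(5000, risky=True)

# The sample means should be close; the spread and the lower tail should not.
print("mean (safe, risky):", statistics.fmean(safe_returns), statistics.fmean(risky_returns))
print("stdev (safe, risky):", statistics.stdev(safe_returns), statistics.stdev(risky_returns))
print("lower-tail mean (safe, risky):", lower_tail_mean(safe_returns), lower_tail_mean(risky_returns))
```

Under these assumptions, an agent that only tracks the mean has no basis for preferring either action, whereas a risk-sensitive criterion computed from the return distribution would favor the safe one.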