Counterfactuals for Reinforcement Learning II: Improving Reward Learning
Jan. 15, 2022, 9:54 p.m. | Felix Hofstätter
Towards Data Science (Medium) | towardsdatascience.com
Safer reward function learning using counterfactuals
In the previous part of this series, I introduced counterfactuals and showed how to encode them in the POMDP framework. In this part, I will focus on how counterfactuals can be applied in the emerging field of Reward Learning. The article will first give a brief summary of the basic elements of Reward Learning. Using a running example, I will then demonstrate how Reward Learning can fail to produce the desired outcome. Ultimately, …