all AI news
Direct Behavior Specification via Constrained Reinforcement Learning. (arXiv:2112.12228v2 [cs.LG] UPDATED)
Jan. 21, 2022, 2:11 a.m. | Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal
cs.LG updates on arXiv.org arxiv.org
The standard formulation of Reinforcement Learning lacks a practical way of
specifying what are admissible and forbidden behaviors. Most often,
practitioners go about the task of behavior specification by manually
engineering the reward function, a counter-intuitive process that requires
several iterations and is prone to reward hacking by the agent. In this work,
we argue that constrained RL, which has almost exclusively been used for safe
RL, also has the potential to significantly reduce the amount of work spent for …
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Data Engineer
@ Parker | New York City
Sr. Data Analyst | Home Solutions
@ Three Ships | Raleigh or Charlotte, NC