Direct Behavior Specification via Constrained Reinforcement Learning. (arXiv:2112.12228v2 [cs.LG] UPDATED)

Jan. 21, 2022, 2:11 a.m. | Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal

cs.LG updates on arXiv.org arxiv.org

The standard formulation of Reinforcement Learning lacks a practical way of
specifying which behaviors are admissible and which are forbidden. Most often,
practitioners approach the task of behavior specification by manually
engineering the reward function, a counter-intuitive process that requires
several iterations and is prone to reward hacking by the agent. In this work,
we argue that constrained RL, which has almost exclusively been used for safe
RL, also has the potential to significantly reduce the amount of work spent for …
