Aug. 22, 2022, 1:10 a.m. | Jared Markowitz, Ryan W. Gardner, Ashley Llorens, Raman Arora, I-Jeng Wang

cs.LG updates on arXiv.org

Standard deep reinforcement learning (DRL) aims to maximize expected reward,
considering collected experiences equally in formulating a policy. This differs
from human decision-making, where gains and losses are valued differently and
outlying outcomes are given increased consideration. It also fails to
capitalize on opportunities to improve safety and/or performance through the
incorporation of distributional context. Several approaches to distributional
DRL have been investigated, with one popular strategy being to evaluate the
projected distribution of returns for possible actions. We propose …
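To make the "projected distribution of returns" strategy concrete, here is a minimal sketch of how a distributional critic's quantile estimates can drive action selection under either a risk-neutral (mean) or risk-averse (CVaR) criterion. The quantile values, the three actions, and the CVaR level are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical per-action quantile estimates of return, as a
# quantile-based distributional critic (e.g., QR-DQN-style) would
# produce. Values are synthetic, for illustration only.
rng = np.random.default_rng(0)
n_quantiles = 51
quantiles = {
    a: np.sort(rng.normal(loc=mu, scale=sigma, size=n_quantiles))
    for a, (mu, sigma) in enumerate([(1.0, 0.2), (1.2, 1.5), (0.8, 0.1)])
}

def expected_return(q):
    # Risk-neutral criterion: mean over the quantile estimates.
    return q.mean()

def cvar(q, alpha=0.2):
    # Risk-averse criterion: mean of the worst alpha-fraction
    # of outcomes (quantiles are sorted ascending).
    k = max(1, int(np.ceil(alpha * len(q))))
    return q[:k].mean()

# The same distributional estimates support different policies
# depending on which statistic of the return distribution is used.
greedy_action = max(quantiles, key=lambda a: expected_return(quantiles[a]))
risk_averse_action = max(quantiles, key=lambda a: cvar(quantiles[a]))

print("risk-neutral choice:", greedy_action)
print("risk-averse (CVaR) choice:", risk_averse_action)
```

A standard expected-reward agent collapses each distribution to its mean before acting; retaining the full distribution, as above, is what allows the distributional context the abstract refers to (e.g., penalizing high-variance actions) to influence the policy.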

