An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient. (arXiv:2307.08873v2 [cs.LG] UPDATED)
cs.LG updates on arXiv.org arxiv.org
Restricting the variance of a policy's return is a popular choice in
risk-averse Reinforcement Learning (RL) due to its clear mathematical
definition and easy interpretability. Traditional methods directly restrict the
total return variance. Recent methods restrict the per-step reward variance as
a proxy. We thoroughly examine the limitations of these variance-based methods,
such as sensitivity to numerical scale and a tendency to hinder policy learning, and
propose to use an alternative risk measure, Gini deviation, as a substitute. We
study various properties …
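
To make the scale-sensitivity point concrete, here is a minimal sketch, not the authors' implementation. It assumes the common definition of Gini deviation as half the expected absolute difference between two independent copies of the return, and estimates it by averaging over all distinct sample pairs. The function name `gini_deviation` and the sample returns are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from statistics import pvariance


def gini_deviation(samples):
    """Estimate Gini deviation as half the mean absolute difference
    over all distinct pairs of samples (assumed definition, see lead-in)."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    mean_abs_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return 0.5 * mean_abs_diff


# Hypothetical episode returns, and the same returns with rewards scaled by 10.
returns = [1.0, 2.0, 3.0, 4.0]
scaled = [10.0 * r for r in returns]

# Ratio of each risk measure after scaling to its value before scaling.
gd_ratio = gini_deviation(scaled) / gini_deviation(returns)
var_ratio = pvariance(scaled) / pvariance(returns)

print("Gini deviation grows by a factor of", gd_ratio)
print("Variance grows by a factor of", var_ratio)
```

Under this definition, multiplying every return by a constant c multiplies the Gini deviation by |c| but the variance by c squared, which is one way to see why a variance penalty is more sensitive to the numerical scale of the rewards.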