Web: http://arxiv.org/abs/2105.01648

May 11, 2022, 1:11 a.m. | Marc Aurel Vischer, Robert Tjarko Lange, Henning Sprekeler

cs.LG updates on arXiv.org

The lottery ticket hypothesis questions the role of overparameterization in
supervised deep learning. But how is the performance of winning lottery tickets
affected by the distributional shift inherent to reinforcement learning
problems? In this work, we address this question by comparing sparse agents who
have to address the non-stationarity of the exploration-exploitation problem
with supervised agents trained to imitate an expert. We show that, compared to
reinforcement learning, feed-forward networks trained with behavioural cloning
can be pruned to higher levels …


