Optimal scheduling of entropy regulariser for continuous-time linear-quadratic reinforcement learning. (arXiv:2208.04466v1 [cs.LG])
stat.ML updates on arXiv.org
This work uses the entropy-regularised relaxed stochastic control perspective
as a principled framework for designing reinforcement learning (RL) algorithms.
Herein, the agent interacts with the environment by generating noisy controls
distributed according to the optimal relaxed policy. On the one hand, the noisy
policies explore the space and hence facilitate learning; on the other hand,
they introduce bias by assigning positive probability to non-optimal
actions. This exploration-exploitation trade-off is determined by the strength
of entropy regularisation. We study algorithms resulting …
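
The trade-off described above can be illustrated with a small sketch. It is not the algorithm studied in the paper; it assumes a static, one-dimensional quadratic cost f(a) = r*a^2/2 + b*a, where the names b, r and tau (the regularisation strength) are hypothetical choices made for this example. Under that assumption, minimising expected cost minus tau times entropy gives a Gaussian policy with mean -b/r and variance tau/r, so a larger tau means more exploration and an expected excess cost of tau/2 over the optimal action.

```python
import math
import random


def gibbs_policy(b, r, tau):
    """Mean and variance of the Gaussian that minimises
    E[r*a^2/2 + b*a] - tau * entropy (assumed toy cost, r > 0, tau > 0)."""
    mean = -b / r   # coincides with the unregularised optimal action
    var = tau / r   # exploration noise grows linearly with tau
    return mean, var


def cost(a, b, r):
    """Quadratic cost of playing action a."""
    return 0.5 * r * a * a + b * a


def expected_excess_cost(b, r, tau):
    """Closed-form bias: E[cost(A)] - cost(optimal action) = r * var / 2 = tau / 2."""
    _, var = gibbs_policy(b, r, tau)
    return 0.5 * r * var


def sample_excess_cost(b, r, tau, n=100_000, seed=0):
    """Monte Carlo estimate of the same bias, by sampling noisy actions."""
    rng = random.Random(seed)
    mean, var = gibbs_policy(b, r, tau)
    std = math.sqrt(var)
    optimal = cost(mean, b, r)
    total = sum(cost(rng.gauss(mean, std), b, r) for _ in range(n))
    return total / n - optimal


if __name__ == "__main__":
    for tau in (1.0, 0.1, 0.01):
        print(tau, expected_excess_cost(1.0, 2.0, tau), sample_excess_cost(1.0, 2.0, tau))
```

In this toy setting, shrinking tau over time reduces the bias but also the spread of sampled actions, which is the tension a schedule for the regulariser has to balance.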