Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
April 30, 2024, 4:46 a.m. | Robert Denkert, Huyên Pham, Xavier Warin
stat.ML updates on arXiv.org
Abstract: We propose a comprehensive framework for policy gradient methods tailored to continuous time reinforcement learning. This is based on the connection between stochastic control problems and randomised problems, enabling applications across various classes of Markovian continuous time control problems, beyond diffusion models, including e.g. regular, impulse and optimal stopping/switching problems. By utilizing change of measure in the control randomisation technique, we derive a new policy gradient representation for these randomised problems, featuring parametrised intensity policies. …