all AI news
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
March 5, 2024, 2:45 p.m. | Michael Giegrich, Christoph Reisinger, Yufei Zhang
cs.LG updates on arXiv.org arxiv.org
Abstract: We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. Contrary to discrete-time problems, the cost is noncoercive in the policy and not all descent directions lead to bounded iterates. …
abstract arxiv continuous control convergence costs cs.lg entropy exploratory global gradient horizon linear math.oc mean policy stochastic study type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer
@ GPTZero | Toronto, Canada
ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)
@ HelloBetter | Remote
Doctoral Researcher (m/f/div) in Automated Processing of Bioimages
@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena
Seeking Developers and Engineers for AI T-Shirt Generator Project
@ Chevon Hicks | Remote
Principal Data Architect - Azure & Big Data
@ MGM Resorts International | Home Office - US, NV
GN SONG MT Market Research Data Analyst 11
@ Accenture | Bengaluru, BDC7A