March 5, 2024, 2:45 p.m. | Michael Giegrich, Christoph Reisinger, Yufei Zhang

cs.LG updates on arXiv.org

arXiv:2211.00617v3 Announce Type: replace-cross
Abstract: We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. Contrary to discrete-time problems, the cost is noncoercive in the policy and not all descent directions lead to bounded iterates. …

