May 20, 2022, 1:12 a.m. | Dachao Lin, Zhihua Zhang

cs.LG updates on arXiv.org

In this short note, we give a convergence analysis of the policy in the recently popular policy mirror descent (PMD) method. We mainly consider the unregularized setting, following [11], with a generalized Bregman divergence. The difference is that we directly give convergence rates for the policy under a generalized Bregman divergence. Our results are inspired by the convergence of the value function in previous works and extend the study of policy mirror descent. Though some results have already appeared in previous work, we …
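To make the setting concrete, the following is a minimal sketch of one PMD update in the simplest case: a single-state MDP (a bandit) with the KL (negative-entropy) Bregman divergence, under which the update has a closed-form multiplicative-weights solution. All names and parameter values are illustrative assumptions, not taken from the paper, which treats generalized Bregman divergences.

```python
import numpy as np

def pmd_step(policy, q_values, eta):
    """One PMD step: argmax_p  eta * <q_values, p> - KL(p || policy).

    With the KL divergence this has the closed form
    p(a) proportional to policy(a) * exp(eta * q_values(a)).
    """
    logits = np.log(policy) + eta * q_values
    logits -= logits.max()            # shift for numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum()

# Toy example: 3 actions with fixed Q-values; repeated PMD steps
# concentrate the policy on the highest-value action.
q = np.array([1.0, 0.5, 0.2])
pi = np.full(3, 1 / 3)
for _ in range(200):
    pi = pmd_step(pi, q, eta=0.5)
print(pi.argmax())  # index of the best action: 0
```

For a general MDP the same step is applied per state with `q_values` replaced by the current Q-function estimate at that state; other mirror maps yield other (generally implicit) updates.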

arxiv convergence math policy
