Aug. 19, 2022, 1:10 a.m. | Xuyang Chen, Jingliang Duan, Yingbin Liang, Lin Zhao

cs.LG updates on arXiv.org

Actor-critic (AC) reinforcement learning algorithms have been the
powerhouse behind many challenging applications. Nevertheless, their convergence
is fragile in general. To study this instability, existing works mostly consider
the uncommon double-loop variant or basic models with finite state and action
spaces. We investigate the more practical single-sample two-timescale AC for
solving the canonical linear quadratic regulator (LQR) problem, where the actor
and the critic each update only once with a single sample in each iteration, on an
unbounded continuous state …
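To make the setup concrete, below is a minimal sketch of a single-sample two-timescale actor-critic loop on a toy LQR instance: a linear policy (actor) and a quadratic critic are each updated once per iteration from one transition, with the critic on a faster step size than the actor. The dynamics, features, noise levels, discount factor, and learning rates are all assumptions for illustration, not the paper's exact algorithm or analysis setting.

```python
# Illustrative sketch (assumed setup, not the paper's exact algorithm):
# single-sample two-timescale actor-critic on a toy LQR problem.
import numpy as np

rng = np.random.default_rng(0)

# Toy LQR: x_{t+1} = A x_t + B u_t + w_t, cost c(x,u) = x'Qx + u'Ru (all assumed)
n, m = 2, 1
A = np.array([[0.9, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Q = np.eye(n)
R = 0.1 * np.eye(m)

def quad_features(x, u):
    """Quadratic features of the state-action pair for a linear critic."""
    z = np.concatenate([x, u])
    return np.outer(z, z)[np.triu_indices(n + m)]  # upper-triangular entries

K = np.zeros((m, n))                                         # actor: u = -K x
w = np.zeros_like(quad_features(np.zeros(n), np.zeros(m)))   # critic weights
sigma = 0.1                                                  # exploration noise std (assumed)
gamma = 0.99                                                 # discount factor (assumed)
alpha_critic, alpha_actor = 1e-2, 1e-4                       # two timescales: critic faster than actor

x = rng.normal(size=n)
for t in range(50_000):
    # One sample per iteration: act, observe cost and next state
    u = -K @ x + sigma * rng.normal(size=m)
    cost = x @ Q @ x + u @ R @ u
    x_next = A @ x + B @ u + 0.01 * rng.normal(size=n)
    u_next = -K @ x_next

    # Critic: single-sample TD(0) update of the quadratic Q-function estimate
    phi, phi_next = quad_features(x, u), quad_features(x_next, u_next)
    td_error = cost + gamma * (w @ phi_next) - (w @ phi)
    w += alpha_critic * td_error * phi

    # Actor: single-sample policy-gradient-style step using the critic's estimate
    grad_logpi = -np.outer(u + K @ x, x) / sigma**2  # grad of Gaussian log-policy w.r.t. K
    K -= alpha_actor * (w @ phi) * grad_logpi        # descend the estimated cost

    x = x_next

print("Learned gain K:\n", K)
```

The "two-timescale" structure shows up only in the choice alpha_critic >> alpha_actor, so the critic tracks the current policy's value while the actor drifts slowly; both updates use the same single transition each iteration.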

