Web: http://arxiv.org/abs/2201.08610

Jan. 24, 2022, 2:10 a.m. | Balázs Varga, Balázs Kulcsár, Morteza Haghir Chehreghani

cs.LG updates on arXiv.org (arxiv.org)

In this paper, we place deep Q-learning into a control-oriented perspective
and study its learning dynamics with well-established techniques from robust
control. We formulate an uncertain linear time-invariant model by means of the
neural tangent kernel to describe learning. We show the instability of learning
and analyze the agent's behavior in the frequency domain. Then, we ensure
convergence via robust controllers acting as dynamical rewards in the loss
function. We synthesize three controllers: state-feedback gain scheduling
$\mathcal{H}_2$, dynamic $\mathcal{H}_\infty$, and constant gain …
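The abstract's key modeling step, linearizing learning via the neural tangent kernel, can be illustrated with a minimal sketch. Under the NTK linearization, semi-gradient Q-learning/TD updates evolve (in function space) as a linear time-invariant system, and convergence hinges on the spectral radius of the closed-loop matrix. The MDP, the Gram matrix `K`, and all numbers below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical 2-state MDP (illustrative; not from the paper).
gamma = 0.9
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])      # transition probabilities
r = np.array([1.0, 0.0])        # rewards

# Assumed NTK Gram matrix (symmetric PSD). In the paper this arises from
# the network's neural tangent kernel; here it is simply fixed by hand.
K = np.array([[2.0, 1.5],
              [1.5, 2.0]])

eta = 0.1                        # learning rate

# NTK-linearized function-space dynamics of semi-gradient TD updates:
#   q_{t+1} = q_t - eta * K @ (q_t - (r + gamma * P @ q_t))
# which is the LTI system q_{t+1} = A q_t + eta * K @ r with
A = np.eye(2) - eta * K @ (np.eye(2) - gamma * P)

# Learning converges iff the spectral radius of A is below 1;
# a radius >= 1 corresponds to the instability analyzed in the paper.
rho = max(abs(np.linalg.eigvals(A)))
print("spectral radius:", rho)
```

With these particular numbers the radius is below 1, so the sketched dynamics converge; different kernels, discount factors, or learning rates can push the radius past 1, which is the divergent regime the paper's robust controllers are designed to suppress.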

