Web: http://arxiv.org/abs/2201.10542

Jan. 26, 2022, 2:11 a.m. | Akshay Mete, Rahul Singh, P. R. Kumar

cs.LG updates on arXiv.org

We consider the problem of controlling a stochastic linear system with
quadratic costs, when its system parameters are not known to the agent --
called the adaptive LQG control problem. We re-examine an approach called
"Reward-Biased Maximum Likelihood Estimate" (RBMLE) that was proposed more than
forty years ago, and which predates the "Upper Confidence Bound" (UCB) method
as well as the definition of "regret". It simply added a term favoring
parameters with larger rewards to the estimation criterion. We propose …
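The biasing idea in the abstract, adding a term that favors parameters with larger rewards to the estimation criterion, can be illustrated with a minimal sketch. This is not the authors' algorithm; it rests on several assumptions that are not in the source: a scalar system x' = a*x + b*u + w with unit-variance Gaussian noise, a finite list of candidate parameters (a, b) with b nonzero, quadratic cost q*x^2 + r*u^2, and a hypothetical `bias_weight` that sets how strongly low-cost (high-reward) parameters are favored. All function names are illustrative.

```python
def riccati_cost(a, b, q=1.0, r=1.0, iters=500):
    """Optimal average LQ cost (per unit noise variance) for the scalar
    system x' = a*x + b*u + w, found by iterating the scalar Riccati map."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def log_likelihood(a, b, data):
    """Gaussian log-likelihood (up to a constant) of (x, u, x_next) triples,
    assuming unit noise variance."""
    return -0.5 * sum((x_next - a * x - b * u) ** 2 for x, u, x_next in data)


def rbmle_estimate(candidates, data, bias_weight):
    """Pick the candidate maximizing log-likelihood plus a reward bias.
    Reward is taken as the negative optimal cost (an assumption here)."""
    def score(theta):
        a, b = theta
        return log_likelihood(a, b, data) - bias_weight * riccati_cost(a, b)
    return max(candidates, key=score)


if __name__ == "__main__":
    # One observed transition, generated without noise from a = 0.9, b = 1.0.
    data = [(1.0, 0.0, 0.9)]
    candidates = [(0.5, 1.0), (0.9, 1.0)]
    print(rbmle_estimate(candidates, data, bias_weight=0.0))  # plain maximum likelihood
    print(rbmle_estimate(candidates, data, bias_weight=1.0))  # reward-biased choice
```

With no bias the estimate is the ordinary maximum likelihood choice; with a positive bias and little data, the estimate shifts toward the candidate whose optimal cost is lower, which is the optimistic behavior the abstract attributes to RBMLE.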

