Jan. 26, 2022, 2:11 a.m. | Akshay Mete, Rahul Singh, P. R. Kumar

cs.LG updates on arXiv.org

We consider the problem of controlling a stochastic linear system with
quadratic costs, when its system parameters are not known to the agent --
called the adaptive LQG control problem. We re-examine an approach called
"Reward-Biased Maximum Likelihood Estimate" (RBMLE) that was proposed more than
forty years ago, and which predates the "Upper Confidence Bound" (UCB) method
as well as the definition of "regret". It simply added a term favoring
parameters with larger rewards to the estimation criterion. We propose …
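
As a rough illustration of the biasing idea described above, one common way to write a reward-biased criterion in cost form is sketched here. The notation is an assumption for illustration and is not taken from the abstract: $\theta$ collects the unknown system parameters, $L_t(\theta)$ is the likelihood of the data observed up to time $t$, $J^{\star}(\theta)$ is the optimal average cost the agent would incur if $\theta$ were the true parameter, and $\alpha(t) > 0$ is a bias weight usually chosen to grow more slowly than $t$.

```latex
\hat{\theta}_t \in \arg\min_{\theta \in \Theta}
  \Bigl\{ -\log L_t(\theta) \;+\; \alpha(t)\, J^{\star}(\theta) \Bigr\}
```

With $\alpha(t) = 0$ this reduces to ordinary maximum likelihood estimation. A positive $\alpha(t)$ tilts the estimate toward parameters with lower optimal cost (equivalently, larger reward), which is the source of the optimism that the abstract compares with UCB-style methods.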

