Web: http://arxiv.org/abs/2201.08732

Jan. 24, 2022, 2:10 a.m. | Robert Müller, Aldo Pacchiano

cs.LG updates on arXiv.org

We study meta-learning in Markov Decision Processes (MDPs) with linear
transition models in the undiscounted episodic setting. Under a task sharedness
metric based on model proximity, we study task families characterized by a
distribution over models specified by a bias term and a variance component. We
then propose BUC-MatrixRL, a version of the UC-Matrix RL algorithm, and show it
can meaningfully leverage a set of sampled training tasks to quickly solve a
test task sampled from the same task distribution …
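
To make the "bias term plus variance component" picture concrete, here is a minimal sketch of such a task family. It is not BUC-MatrixRL: it assumes each task's model is a small matrix equal to a shared bias plus Gaussian noise, and that training-task models are observed directly, whereas in practice they would have to be estimated from interaction. The names `sample_task_models` and `estimate_bias`, the 3x3 shape, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_task_models(bias, std, num_tasks, rng):
    """Draw task models as bias + i.i.d. Gaussian noise (an assumed form)."""
    noise = rng.normal(0.0, std, size=(num_tasks,) + bias.shape)
    return bias + noise


def estimate_bias(task_models):
    """Estimate the shared bias by averaging the training-task models."""
    return task_models.mean(axis=0)


rng = np.random.default_rng(0)
bias = rng.normal(size=(3, 3))  # hypothetical shared bias matrix
train_models = sample_task_models(bias, std=0.1, num_tasks=200, rng=rng)
bias_hat = estimate_bias(train_models)

# A fresh test task drawn from the same distribution.
test_model = sample_task_models(bias, std=0.1, num_tasks=1, rng=rng)[0]

# Distance from the test model to the learned bias vs. to an uninformed zero guess.
print("distance to estimated bias:", np.linalg.norm(test_model - bias_hat))
print("distance to zero guess:    ", np.linalg.norm(test_model))
```

In this toy setup, the more training tasks are averaged, the closer the estimate sits to the true bias, which is the sense in which a test task from the same distribution starts from a better initial guess.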

