March 13, 2024, 4:43 a.m. | Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

cs.LG updates on arXiv.org

arXiv:2401.11237v2 Announce Type: replace
Abstract: Some reinforcement learning (RL) algorithms can stitch pieces of experience to solve a task never seen before during training. This oft-sought property is one of the few ways in which RL methods based on dynamic programming differ from RL methods based on supervised learning (SL). Yet, certain RL methods based on off-the-shelf SL algorithms achieve excellent results without an explicit mechanism for stitching; it remains unclear whether those methods forgo this important stitching property. This paper studies …
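To make the stitching property concrete, here is a minimal sketch (not taken from the paper) of how dynamic programming composes experience: tabular Q-learning on an offline dataset whose trajectories cover s0→s1 and s1→s2 separately, but never the full path s0→s2. Bootstrapped value targets propagate reward across trajectory boundaries, so the greedy policy solves the unseen composite task. All names and numbers here are illustrative assumptions.

```python
import numpy as np

# Tiny chain MDP: s0 -> s1 -> s2 (goal), one action per state.
n_states, n_actions = 3, 1
gamma = 0.9

# Offline dataset of (state, action, reward, next_state, done) tuples.
# Trajectory 1 only visits s0 -> s1; trajectory 2 only visits s1 -> s2.
# No single trajectory connects s0 to the goal.
dataset = [
    (0, 0, 0.0, 1, False),  # from trajectory 1
    (1, 0, 1.0, 2, True),   # from trajectory 2
]

Q = np.zeros((n_states, n_actions))
for _ in range(100):  # sweep the fixed dataset until values converge
    for s, a, r, s_next, done in dataset:
        target = r if done else r + gamma * Q[s_next].max()
        Q[s, a] += 0.5 * (target - Q[s, a])

# Q[0, 0] converges to ~0.9 = gamma * 1.0: the reward observed only in
# trajectory 2 has been propagated back through s1 to s0, "stitching"
# the two trajectories into a solution for the unseen task s0 -> s2.
print(Q)
```

A pure SL approach such as behavior cloning, trained on the same two trajectories, has no bootstrapping step and therefore no comparable mechanism for combining them; this is the gap the abstract refers to.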
