Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View
March 13, 2024, 4:43 a.m. | Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach
cs.LG updates on arXiv.org arxiv.org
Abstract: Some reinforcement learning (RL) algorithms can stitch pieces of experience to solve a task never seen before during training. This oft-sought property is one of the few ways in which RL methods based on dynamic-programming differ from RL methods based on supervised-learning (SL). Yet, certain RL methods based on off-the-shelf SL algorithms achieve excellent results without an explicit mechanism for stitching; it remains unclear whether those methods forgo this important stitching property. This paper studies …