Feb. 27, 2024, 5:43 a.m. | Nico Messikommer, Yunlong Song, Davide Scaramuzza

cs.LG updates on arXiv.org

arXiv:2309.09752v3 Announce Type: replace
Abstract: In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. While recent works have been effective in leveraging past experiences for policy updates, they often overlook the potential of reusing past experiences for data collection. Independent of the underlying RL algorithm, we introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent …
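The idea of reusing past states to initialize episodes can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration only: it keeps a scored buffer of previously visited states and samples high-scoring ones as reset states, falling back to the environment's default start state when the buffer is empty. The scoring rule and the `InitialStateBuffer` API are assumptions for illustration; the paper's actual contrastive selection criterion is not reproduced here.

```python
import random


class InitialStateBuffer:
    """Hypothetical sketch of an initial-state buffer for episode resets.

    Stores (state, score) pairs from past episodes and samples
    score-weighted start states. The scoring rule (higher = preferred)
    is an assumption, not the paper's contrastive criterion.
    """

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.buffer = []  # list of (state, score) pairs

    def add(self, state, score):
        """Record a visited state with an informativeness score."""
        self.buffer.append((state, score))
        if len(self.buffer) > self.capacity:
            # Evict the lowest-scoring entry to stay within capacity.
            self.buffer.sort(key=lambda entry: entry[1])
            self.buffer.pop(0)

    def sample_initial_state(self, default_state):
        """Pick an episode start state, preferring high-scoring ones."""
        if not self.buffer:
            # No past experience yet: use the environment's default reset.
            return default_state
        states, scores = zip(*self.buffer)
        total = sum(scores)
        if total <= 0:
            return random.choice(states)
        # Score-weighted sampling favours informative past states.
        return random.choices(states, weights=scores, k=1)[0]
```

In use, a training loop would call `add` on states encountered during rollouts and `sample_initial_state` before each environment reset, leaving the underlying RL algorithm unchanged, which matches the algorithm-independence claim in the abstract.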

