Oct. 21, 2022, 1:12 a.m. | Vaibhav Bajaj, Guni Sharon, Peter Stone

cs.LG updates on arXiv.org arxiv.org

Applying reinforcement learning (RL) to sparse reward domains is notoriously
challenging due to insufficient guiding signals. Common techniques for
addressing such domains include (1) learning from demonstrations and (2)
curriculum learning. While these two approaches have been studied in detail,
they have rarely been considered together. This paper aims to do so by
introducing a principled task phasing approach that uses demonstrations to
automatically generate a curriculum sequence. Using inverse RL from
(suboptimal) demonstrations we define a simple initial task. …

arxiv curriculum curriculum learning

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Data Scientist (Database Development)

@ Nasdaq | Bengaluru-Affluence