all AI news
SubIQ: Inverse Soft-Q Learning for Offline Imitation with Suboptimal Demonstrations
Feb. 21, 2024, 5:42 a.m. | Huy Hoang, Tien Mai, Pradeep Varakantham
cs.LG updates on arXiv.org arxiv.org
Abstract: We consider offline imitation learning (IL), which aims to mimic the expert's behavior from its demonstration without further interaction with the environment. One of the main challenges in offline IL is dealing with the limited support of expert demonstrations that cover only a small fraction of the state-action spaces. In this work, we consider offline IL, where expert demonstrations are limited but complemented by a larger set of sub-optimal demonstrations of lower expertise levels. Most …
abstract arxiv behavior challenges cs.ai cs.lg environment expert imitation learning offline small support the environment type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Data Engineer (m/f/d)
@ Project A Ventures | Berlin, Germany
Principle Research Scientist
@ Analog Devices | US, MA, Boston