Align Your Intents: Offline Imitation Learning via Optimal Transport
Feb. 21, 2024, 5:42 a.m. | Maksim Bobrin, Nazar Buzun, Dmitrii Krylov, Dmitry V. Dylov
cs.LG updates on arXiv.org
Abstract: Offline reinforcement learning (RL) addresses the problem of sequential decision-making by learning an optimal policy from pre-collected data, without interacting with the environment. As yet, it has remained somewhat impractical, because one rarely knows the reward explicitly and it is hard to distill it retrospectively. Here, we show that an imitating agent can still learn the desired behavior merely from observing the expert, despite the absence of explicit rewards or action labels. In our method, AILOT …