IL-flOw: Imitation Learning from Observation using Normalizing Flows. (arXiv:2205.09251v1 [cs.LG])
May 20, 2022, 1:11 a.m. | Wei-Di Chang, Juan Camilo Gamboa Higuera, Scott Fujimoto, David Meger, Gregory Dudek
cs.LG updates on arXiv.org arxiv.org
We present an algorithm for Inverse Reinforcement Learning (IRL) from expert
state observations only. Our approach decouples reward modelling from policy
learning, unlike state-of-the-art adversarial methods which require updating
the reward model during policy search and are known to be unstable and
difficult to optimize. Our method, IL-flOw, recovers the expert policy by
modelling state-state transitions: it generates rewards using deep density
estimators trained on the demonstration trajectories, which avoids the instability
issues of adversarial methods. We demonstrate that using the …
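
The decoupling described in the abstract can be illustrated with a minimal sketch: fit a density model once on expert state-state pairs, then freeze it and use it as a reward. Everything here is an assumption made for illustration and is not taken from the paper: the names `AffineFlowDensity`, `transition_pairs`, and `make_reward` are hypothetical, a single affine flow (equivalent to a multivariate Gaussian) stands in for the deep normalizing flow, and using the log-density of the pair (s, s') as the reward is a guess at the reward form, which the truncated abstract does not specify.

```python
import numpy as np


class AffineFlowDensity:
    """Single affine flow x = mu + L z with z ~ N(0, I).

    Assumption: this is a stand-in for a deep normalizing flow; it is
    equivalent to a multivariate Gaussian and has a closed-form fit.
    """

    def fit(self, x):
        # Closed-form maximum-likelihood estimate of mean and covariance.
        self.mu = x.mean(axis=0)
        cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(x.shape[1])
        self.L = np.linalg.cholesky(cov)
        return self

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, I) - log|det L|.
        z = np.linalg.solve(self.L, (x - self.mu).T).T
        d = x.shape[1]
        log_base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * d * np.log(2 * np.pi)
        log_det = np.log(np.diag(self.L)).sum()
        return log_base - log_det


def transition_pairs(states):
    # Stack consecutive states (s_t, s_{t+1}) into one vector per transition.
    return np.concatenate([states[:-1], states[1:]], axis=1)


def make_reward(expert_trajectories):
    # Fit the density once on expert observations only (no actions needed).
    pairs = np.concatenate([transition_pairs(t) for t in expert_trajectories])
    density = AffineFlowDensity().fit(pairs)

    def reward(s, s_next):
        # Hypothetical reward: log-density of the observed transition.
        x = np.concatenate([s, s_next])[None, :]
        return float(density.log_prob(x)[0])

    return reward


# Toy expert (made up for this sketch): a 1-D state that drifts by +0.1 per step.
rng = np.random.default_rng(0)
expert_trajectories = []
for _ in range(5):
    steps = 0.1 + 0.01 * rng.standard_normal(20)
    start = rng.uniform(0.0, 1.0)
    states = start + np.concatenate([[0.0], np.cumsum(steps)])
    expert_trajectories.append(states[:, None])

reward_fn = make_reward(expert_trajectories)
```

Because `reward_fn` stays fixed after fitting, any standard reinforcement learning algorithm could in principle optimize it without the reward being updated during policy search, which is the contrast with adversarial methods that the abstract draws. In the toy setup, a transition that moves in the expert's direction, such as 0.5 to 0.6, scores higher than one that moves against it, such as 0.5 to 0.4.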