Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble. (arXiv:2206.00238v1 [cs.LG])
June 2, 2022, 1:10 a.m. | Fan-Ming Luo, Xingchen Cao, Yang Yu
cs.LG updates on arXiv.org
Inverse reinforcement learning (IRL) recovers the underlying reward function
from expert demonstrations. A generalizable reward function is especially
desirable because it captures the fundamental motivation of the expert. However,
classical IRL methods can only recover reward functions coupled with the
training dynamics, so the learned rewards are hard to generalize to a changed
environment. Previous dynamics-agnostic reward learning methods make strict
assumptions, such as requiring the reward function to be state-only. This work
proposes a general approach to learn transferable reward functions, …