Web: http://arxiv.org/abs/2110.14754

Jan. 27, 2022, 2:11 a.m. | Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

cs.LG updates on arXiv.org

Most existing imitation learning approaches assume that demonstrations are
drawn from optimal experts, but relaxing this assumption enables the use of a
wider range of data. However, standard imitation learning may learn a
suboptimal policy from demonstrations of varying optimality. Prior works use
confidence scores or rankings to extract the beneficial information from such
demonstrations, but they suffer from limitations such as requiring manually
annotated confidence scores or a high average optimality of the
demonstrations. In this paper, we propose a …
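The idea of using confidence scores to extract useful signal from mixed-quality demonstrations can be illustrated with a minimal sketch. The snippet below is a hypothetical confidence-weighted behavioral cloning loss, not the method proposed in the paper: each demonstration's imitation loss is scaled by an assumed per-demonstration confidence weight, so low-optimality demonstrations contribute less to the policy update.

```python
import numpy as np

def weighted_bc_loss(pred_actions, demo_actions, confidences):
    """Confidence-weighted behavioral cloning loss (illustrative sketch).

    pred_actions, demo_actions: arrays of shape (n_demos, action_dim).
    confidences: per-demonstration weights in [0, 1] (assumed given;
    the paper's contribution concerns how such scores are obtained).
    """
    # Mean squared error of each demonstration's actions.
    per_demo = np.mean((pred_actions - demo_actions) ** 2, axis=1)
    # Normalize confidences so the loss is a weighted average.
    w = confidences / confidences.sum()
    return float(np.sum(w * per_demo))

# Toy example: the second demonstration is far from the policy's output
# and is assigned low confidence, so it barely affects the loss.
pred = np.array([[0.0, 0.0], [0.0, 0.0]])
demo = np.array([[0.1, 0.1], [5.0, 5.0]])
loss_uniform = weighted_bc_loss(pred, demo, np.array([1.0, 1.0]))
loss_weighted = weighted_bc_loss(pred, demo, np.array([1.0, 0.01]))
```

With uniform confidences the poor demonstration dominates the loss; down-weighting it yields a loss close to that of the good demonstration alone, which is the intuition behind confidence-based approaches described above.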

