March 19, 2024, 4:44 a.m. | Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, James Weimer, Insup Lee

cs.LG updates on arXiv.org (arxiv.org)

arXiv:2310.06171v2 Announce Type: replace
Abstract: Imitation learning considerably simplifies policy synthesis compared to alternative approaches by exploiting access to expert demonstrations. For such imitation policies, errors away from the training samples are particularly critical. Even rare slip-ups in the policy action outputs can compound quickly over time, since they lead to unfamiliar future states where the policy is still more likely to err, eventually causing task failures. We revisit simple supervised "behavior cloning" for conveniently training the policy from nothing …
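The behavior cloning the abstract refers to is plain supervised learning on expert state-action pairs. Below is a minimal sketch, assuming a PyTorch setup; the names `BCPolicy` and `train_bc` are hypothetical, and this is vanilla behavior cloning rather than the memory-based model class the paper itself develops. The MSE objective only sees expert states, which is exactly why errors on unfamiliar states can compound at rollout time.

```python
# Minimal behavior-cloning sketch (illustrative only; assumes PyTorch and
# continuous actions). Not the paper's method, just the baseline it revisits.
import torch
import torch.nn as nn


class BCPolicy(nn.Module):
    """Small MLP mapping observations to actions."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def train_bc(policy: BCPolicy, expert_obs: torch.Tensor,
             expert_act: torch.Tensor, epochs: int = 100, lr: float = 1e-3):
    """Fit the policy to expert demonstrations with an MSE regression loss.
    States outside this dataset are never penalized, so mistakes there can
    compound during rollouts."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(policy(expert_obs), expert_act)
        loss.backward()
        opt.step()
    return policy


if __name__ == "__main__":
    # Hypothetical usage with random stand-in data in place of real demos.
    obs_dim, act_dim, n = 17, 6, 1024
    demo_obs = torch.randn(n, obs_dim)
    demo_act = torch.randn(n, act_dim)
    train_bc(BCPolicy(obs_dim, act_dim), demo_obs, demo_act)
```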

Tags: arXiv, cs.AI, cs.LG, cs.RO, imitation learning, memory networks, neural networks
