Jan. 10, 2022, 2:10 a.m. | DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett

cs.LG updates on arXiv.org

Collaborating with humans requires rapidly adapting to their individual
strengths, weaknesses, and preferences. Unfortunately, most standard
multi-agent reinforcement learning techniques, such as self-play (SP) or
population play (PP), produce agents that overfit to their training partners
and do not generalize well to humans. Alternatively, researchers can collect
human data, train a human model using behavioral cloning, and then use that
model to train "human-aware" agents ("behavioral cloning play", or BCP). While
such an approach can improve the generalization of agents …

