Jan. 5, 2022, 2:10 a.m. | Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan

cs.CL updates on arXiv.org arxiv.org

Text adventure games present unique challenges to reinforcement learning
methods due to their combinatorially large action spaces and sparse rewards.
The interplay of these two factors is particularly demanding because large
action spaces require extensive exploration, while sparse rewards provide
limited feedback. This work proposes to tackle the explore-vs-exploit dilemma
using a multi-stage approach that explicitly disentangles these two strategies
within each episode. Our algorithm, called eXploit-Then-eXplore (XTX), begins
each episode using an exploitation policy that imitates a set of …

arxiv exploration games stage text

Data Scientist (m/f/x/d)

@ Symanto Research GmbH & Co. KG | Spain, Germany

Data Science Sustainability Co-Op (Summer & Fall 2024)

@ O-I | Perrysburg, OH, United States

Research Scientist

@ Chevron Phillips Chemical Company | USA: Kingwood, TX, US, 77339

Data Scientist Python (Django) (m/f/d)

@ RoomPriceGenie | Hybrid Mannheim, Remote DACH, Remote Germany

Operational Transformation & Strategy - Data Operations - Associate

@ JPMorgan Chase & Co. | Mumbai, Maharashtra, India

Senior Data Scientist

@ Rocket Travel | Chicago, IL USA