Feb. 13, 2024, 5:45 a.m. | Hongming Zhang, Tongzheng Ren, Chenjun Xiao, Dale Schuurmans, Bo Dai

cs.LG updates on arXiv.org

In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption and degrades the performance of algorithms that conflate observations with state. Partially Observable Markov Decision Processes (POMDPs), on the other hand, provide a general framework that accounts for partial observability in learning, exploration, and planning, but they present significant computational and statistical challenges. To address these difficulties, we develop a representation-based perspective that leads to a coherent …
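The abstract is truncated, so the paper's representation-based method is not reproduced here. As an illustration only, the minimal Python sketch below shows the failure mode the abstract describes: in a POMDP, noisy observations alias the hidden state, so a policy that treats the observation as the state is misled, whereas a Bayesian belief update over the hidden state recovers a sufficient statistic of the history. All state spaces, probabilities, and names in the sketch are hypothetical assumptions, not taken from the paper.

```python
import numpy as np

# Two hidden states {0, 1}; observations report the state correctly
# with probability 0.7 (an assumed value for illustration).
OBS_ACCURACY = 0.7
# Observation model P(obs | state): rows = state, cols = obs.
O = np.array([[OBS_ACCURACY, 1 - OBS_ACCURACY],
              [1 - OBS_ACCURACY, OBS_ACCURACY]])

def belief_update(belief, obs):
    """Bayes rule: b'(s) ∝ P(obs | s) * b(s).
    The belief summarizes the whole observation history, restoring the
    Markov property that raw observations break."""
    unnorm = O[:, obs] * belief
    return unnorm / unnorm.sum()

rng = np.random.default_rng(0)
true_state = 1
belief = np.array([0.5, 0.5])  # uniform prior over hidden states

for t in range(10):
    # Sample a noisy observation of the (fixed) hidden state.
    obs = rng.choice(2, p=O[true_state])
    belief = belief_update(belief, obs)
    print(f"t={t}  obs={obs}  belief={belief.round(3)}")

# A memoryless "observation == state" estimate flips ~30% of the time;
# the belief instead concentrates on the true hidden state.
```

Running the loop, individual observations keep flipping while the belief mass converges to the true hidden state, which is why conditioning on observations alone, as the abstract notes, leads to inferior performance.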
