Feb. 14, 2024, 5:42 a.m. | Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, André Barreto, Will Dabney, Marc G. Bellemare

cs.LG updates on arXiv.org

This paper contributes a new approach to distributional reinforcement learning that elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm …
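For context, the abstract builds on the classical successor representation. The sketch below is not from the paper; all names (`P_pi`, `r`, `gamma`) are illustrative. It computes the tabular SR in closed form and shows the separation of transition structure and reward that the distributional SM is described as carrying over to the distributional setting.

```python
import numpy as np

# Minimal tabular sketch of the classical successor representation (SR),
# the object the paper's distributional successor measure generalizes.
# All names here (P_pi, r, gamma) are illustrative, not from the paper.

n_states = 3
gamma = 0.9

# P_pi[s, s'] = probability of moving s -> s' under a fixed policy pi.
P_pi = np.array([
    [0.1, 0.6, 0.3],
    [0.0, 0.5, 0.5],
    [0.3, 0.3, 0.4],
])

# Per-state rewards.
r = np.array([0.0, 1.0, -0.5])

# SR: Psi[s, s'] = E[ sum_t gamma^t * 1{S_t = s'} | S_0 = s ],
# available in closed form when P_pi is known.
Psi = np.linalg.inv(np.eye(n_states) - gamma * P_pi)

# The SR cleanly separates transition structure from reward:
# the value function is just the SR applied to the reward vector.
V = Psi @ r

# Sanity check against the direct Bellman solution (I - gamma * P_pi)^-1 r.
V_direct = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r)
assert np.allclose(V, V_direct)
print(V)
```

Where the SR stores only expected discounted state occupancies, the paper's distributional SM replaces this expectation with a distribution over occupancy measures (a distribution over distributions), so that consequences of a policy beyond the expected return can be recovered for a given reward.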

