Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity

Jan. 1, 2023, midnight | Ali Kara, Naci Saldi, Serdar Yüksel

JMLR www.jmlr.org

Reinforcement learning algorithms often require the state and action spaces of a Markov decision process (MDP), also called a controlled Markov chain, to be finite, and various efforts have been made in the literature to extend such algorithms to continuous state and action spaces. In this paper, we show that under very mild regularity conditions (in particular, involving only weak continuity of the transition kernel of an MDP), Q-learning for standard Borel MDPs via quantization of states and actions (called Quantized Q-Learning) …
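To make the idea of quantizing states and actions concrete, here is a minimal sketch of tabular Q-learning run on a uniformly quantized version of a continuous problem. It is an illustration only, not the authors' implementation: the toy dynamics `toy_step`, the uniform bins in `quantize`, the cost-minimizing convention, the uniform exploration policy, the `1 / visit count` step size, and the discount factor `beta` are all assumptions made for this example.

```python
import numpy as np


def quantize(x, low, high, n_bins):
    """Map a continuous value in [low, high] to the index of a uniform bin."""
    idx = int((x - low) / (high - low) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp so x == high falls in the last bin


def toy_step(x, u, rng):
    """Hypothetical continuous MDP on [0, 1]: drift by the action plus Gaussian noise."""
    x_next = float(np.clip(x + 0.1 * u + 0.05 * rng.standard_normal(), 0.0, 1.0))
    cost = (x_next - 0.5) ** 2 + 0.01 * u ** 2  # assumed stage cost, bounded by 0.26
    return x_next, cost


def quantized_q_learning(n_state_bins=10, n_action_bins=5,
                         n_steps=20000, beta=0.9, seed=0):
    """Tabular Q-learning on quantized states and a finite action grid."""
    rng = np.random.default_rng(seed)
    actions = np.linspace(-1.0, 1.0, n_action_bins)  # finite action grid
    Q = np.zeros((n_state_bins, n_action_bins))
    visits = np.zeros_like(Q)
    x = rng.uniform(0.0, 1.0)
    for _ in range(n_steps):
        s = quantize(x, 0.0, 1.0, n_state_bins)
        a = rng.integers(n_action_bins)  # uniform random exploration
        x_next, cost = toy_step(x, actions[a], rng)
        s_next = quantize(x_next, 0.0, 1.0, n_state_bins)
        visits[s, a] += 1
        alpha = 1.0 / visits[s, a]  # step size decays with the visit count
        # Cost-minimizing Q-learning update on the quantized pair (s, a).
        Q[s, a] += alpha * (cost + beta * Q[s_next].min() - Q[s, a])
        x = x_next
    return Q, actions
```

A greedy policy for the quantized model can then be read off as `actions[Q.argmin(axis=1)]`, one action per state bin. Note that the learner only ever sees the bin index, not the underlying continuous state; this sketch does not by itself demonstrate the convergence or near-optimality properties that the abstract refers to.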


