Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes | allainews.com

May 1, 2024, 4:43 a.m. | Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

cs.LG updates on arXiv.org arxiv.org

arXiv:2311.11789v2 Announce Type: replace
Abstract: In this work, we consider a cooperative multi-agent Markov decision process (MDP) involving m agents. At each decision epoch, all m agents independently select actions in order to maximize a common long-term objective. In policy iteration for the multi-agent setup, the number of joint actions grows exponentially with the number of agents, incurring huge computational costs. Thus, recent works consider decentralized policy improvement, where each agent improves its decisions unilaterally, assuming that the decisions …
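
To make the scaling claim in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes every agent has the same number of individual actions (the abstract does not say this), and the function names `joint_action_count`, `unilateral_candidate_count`, and `enumerate_joint_actions` are hypothetical, introduced only for illustration. It compares the number of joint actions a centralized improvement step would search per state with the number of candidates examined when each agent changes only its own action.

```python
from itertools import product


def joint_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Joint actions a centralized improvement step searches per state.

    Assumption: every agent has the same number of individual actions.
    """
    return actions_per_agent ** num_agents


def unilateral_candidate_count(num_agents: int, actions_per_agent: int) -> int:
    """Candidates examined per state when each agent varies only its own action.

    Assumption: the other agents' actions are held fixed while one agent searches.
    """
    return num_agents * actions_per_agent


def enumerate_joint_actions(num_agents: int, actions_per_agent: int) -> list:
    """Explicitly list every joint action as a tuple of per-agent action indices."""
    return list(product(range(actions_per_agent), repeat=num_agents))


if __name__ == "__main__":
    # Print both counts for a few (hypothetical) problem sizes.
    for m in (2, 5, 10):
        print(m, joint_action_count(m, 5), unilateral_candidate_count(m, 5))
```

Under these assumptions the joint count grows as a power of the number of agents while the unilateral count grows linearly, which is the computational motivation the abstract gives for decentralized policy improvement.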

