all AI news
Off-Policy Correction For Multi-Agent Reinforcement Learning
April 4, 2024, 4:42 a.m. | Micha{\l} Zawalski, B{\l}a\.zej Osi\'nski, Henryk Michalewski, Piotr Mi{\l}o\'s
cs.LG updates on arXiv.org arxiv.org
Abstract: Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents. Despite apparent similarity to the single-agent case, multi-agent problems are often harder to train and analyze theoretically. In this work, we propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting. The key advantage of our algorithm is its high scalability in a multi-worker setting. To this end, MA-Trace utilizes importance sampling as an off-policy correction method, which …
abstract actor actor-critic agent agents algorithm analyze arxiv case cs.ai cs.lg cs.ma framework multi-agent multiple policy reinforcement reinforcement learning train type work
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Developer AI Senior Staff Engineer, Machine Learning
@ Google | Sunnyvale, CA, USA; New York City, USA
Engineer* Cloud & Data Operations (f/m/d)
@ SICK Sensor Intelligence | Waldkirch (bei Freiburg), DE, 79183