Aug. 29, 2022, 1:11 a.m. | Muhammad Aneeq uz Zaman, Alec Koppel, Sujay Bhatt, Tamer Başar

cs.LG updates on arXiv.org

We consider online reinforcement learning in Mean-Field Games. In contrast to
existing work, we remove the need for a mean-field oracle by developing
an algorithm that estimates the mean-field and the optimal policy from a
single sample path of the generic agent. We call this Sandbox Learning, as it
can be used as a warm-start for any agent operating in a multi-agent
non-cooperative setting. We adopt a two-timescale approach in which an online
fixed-point recursion for the mean-field …
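
To make the two-timescale idea concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm: the abstract is cut off before it says how the two recursions are paired, so the choice below (a tabular Q-learning update on the faster step size, a mean-field estimate on the slower one) is an assumption, as are the dynamics, the congestion-style reward, and every name (`sandbox_sketch`, `step`, `reward`). The only point it illustrates is that both estimates can be updated from one trajectory of a single agent, with no oracle supplying the mean-field.

```python
import random


def sandbox_sketch(steps=5000, seed=0):
    """Toy two-timescale loop along one sample path (illustrative only)."""
    rng = random.Random(seed)
    n_states, n_actions = 3, 2
    gamma, eps = 0.9, 0.1

    # Hypothetical dynamics: action 0 tries to stay, action 1 tries to move to
    # the next state; with probability 0.2 the next state is uniform instead.
    def step(s, a):
        if rng.random() < 0.8:
            return s if a == 0 else (s + 1) % n_states
        return rng.randrange(n_states)

    # Hypothetical congestion reward: crowded states pay less, moving costs 0.1.
    def reward(s, a, mu):
        return 1.0 - mu[s] - 0.1 * a

    q = [[0.0] * n_actions for _ in range(n_states)]
    mu = [1.0 / n_states] * n_states  # mean-field estimate, a probability vector
    s = 0
    for t in range(1, steps + 1):
        alpha = 1.0 / t ** 0.6  # larger step size: the faster iterate (assumed)
        beta = 1.0 / t          # smaller step size: the slower iterate (assumed)

        # Epsilon-greedy action from the current Q estimate.
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda b: q[s][b])

        s_next = step(s, a)
        r = reward(s, a, mu)

        # Faster recursion: Q-learning step against the current mean-field.
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])

        # Slower recursion: move the mean-field estimate toward the visited
        # state; a convex combination keeps it on the probability simplex.
        mu = [(1.0 - beta) * m + beta * (1.0 if i == s_next else 0.0)
              for i, m in enumerate(mu)]
        s = s_next
    return q, mu


if __name__ == "__main__":
    q, mu = sandbox_sketch()
    print("mean-field estimate:", [round(m, 3) for m in mu])
    print("greedy actions:", [max(range(2), key=lambda b: row[b]) for row in q])
```

Because `beta / alpha` shrinks to zero as `t` grows, the mean-field estimate looks almost frozen from the point of view of the Q update, which is the usual intuition behind a two-timescale scheme.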
