Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains via Dual Mirror Descent. (arXiv:2201.12224v3 [cs.LG] UPDATED)
cs.LG updates on arXiv.org
We consider a subclass of $n$-player stochastic games, in which players have
their own internal state/action spaces while they are coupled through their
payoff functions. Each player's internal chain is assumed to be driven by its own
independent transition probabilities. Moreover, players observe only
realizations of their payoffs, not the payoff functions themselves, and cannot observe
each other's states/actions. Under some assumptions on the structure of the
payoff functions, we develop efficient learning algorithms based on dual
averaging and dual mirror descent, …
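
To make the idea of dual averaging from payoff realizations concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a single player with a single state (no internal chain), an entropic regularizer (so the primal step is a softmax), and an importance-weighted payoff estimate with uniform exploration. The function name `dual_averaging_bandit` and the parameters `eta`, `gamma`, `steps`, `seed`, and `mean_payoffs` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Map a score vector to a probability vector (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_averaging_bandit(mean_payoffs, steps=5000, eta=0.05, gamma=0.1, seed=0):
    """Entropic dual averaging driven only by noisy payoff realizations.

    Assumption: one player, one state. The learner never sees
    `mean_payoffs`; it sees only the noisy payoff of the action it played.
    """
    rng = random.Random(seed)
    n = len(mean_payoffs)
    dual = [0.0] * n  # running sum of payoff estimates (the dual variable)

    for _ in range(steps):
        # Primal step: mirror the dual variable back onto the simplex.
        policy = softmax([eta * d for d in dual])
        # Mix in uniform exploration so every action keeps positive probability.
        sample_probs = [(1.0 - gamma) * p + gamma / n for p in policy]
        action = rng.choices(range(n), weights=sample_probs)[0]

        # Only a noisy realization of the chosen action's payoff is observed.
        payoff = mean_payoffs[action] + rng.gauss(0.0, 0.1)

        # Importance weighting gives an unbiased estimate of the payoff vector.
        dual[action] += payoff / sample_probs[action]

    return softmax([eta * d for d in dual])


if __name__ == "__main__":
    final_policy = dual_averaging_bandit([0.2, 0.5, 0.9])
    print([round(p, 3) for p in final_policy])
```

In this toy setting the returned policy should place most of its mass on the action with the highest mean payoff. Extending the sketch to the setting of the abstract would require, at a minimum, per-player state chains and a way to handle the coupling through payoffs, neither of which is modeled here.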