Jan. 1, 2023, midnight | Han Zhong, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

JMLR www.jmlr.org

We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers. In particular, we focus on the class of games where the followers are myopically rational; i.e., they aim to maximize their instantaneous rewards. For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$ such that: (i) $\pi^*$ is the optimal policy for the leader when the followers always …

aim equilibria equilibrium focus game games general markov nash equilibrium policy reinforcement reinforcement learning study

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

C003549 Data Analyst (NS) - MON 13 May

@ EMW, Inc. | Braine-l'Alleud, Wallonia, Belgium

Marketing Decision Scientist

@ Meta | Menlo Park, CA | New York City