all AI news
Learning Adversarial MDPs with Stochastic Hard Constraints
March 7, 2024, 5:41 a.m. | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
cs.LG updates on arXiv.org arxiv.org
Abstract: We study online learning problems in constrained Markov decision processes (CMDPs) with adversarial losses and stochastic hard constraints. We consider two different scenarios. In the first one, we address general CMDPs, where we design an algorithm that attains sublinear regret and cumulative positive constraints violation. In the second scenario, under the mild assumption that a policy strictly satisfying the constraints exists and is known to the learner, we design an algorithm that achieves sublinear regret …
abstract adversarial algorithm arxiv constraints cs.lg decision design general losses markov online learning positive processes stochastic study type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Data Scientist
@ Publicis Groupe | New York City, United States
Bigdata Cloud Developer - Spark - Assistant Manager
@ State Street | Hyderabad, India