Learning Adversarial MDPs with Stochastic Hard Constraints
March 7, 2024, 5:41 a.m. | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
cs.LG updates on arXiv.org
Abstract: We study online learning problems in constrained Markov decision processes (CMDPs) with adversarial losses and stochastic hard constraints. We consider two different scenarios. In the first one, we address general CMDPs, where we design an algorithm that attains sublinear regret and cumulative positive constraints violation. In the second scenario, under the mild assumption that a policy strictly satisfying the constraints exists and is known to the learner, we design an algorithm that achieves sublinear regret …
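To make the two performance measures named in the abstract concrete, here is a minimal sketch of how regret and cumulative positive constraint violation might be computed from logged per-episode values. It assumes the reading that is common in the CMDP literature, in which "positive" means each episode's violation is clipped at zero, so satisfying a constraint in one episode cannot offset a violation in another. The function names, the single scalar constraint, and the `threshold` parameter are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def regret(learner_losses: Sequence[float], comparator_losses: Sequence[float]) -> float:
    """Total loss of the learner minus total loss of a fixed comparator policy.

    Assumption: comparator_losses holds the per-episode losses of the best
    constraint-satisfying policy in hindsight, computed elsewhere.
    """
    return sum(learner_losses) - sum(comparator_losses)


def cumulative_signed_violation(costs: Sequence[float], threshold: float) -> float:
    """Sum of (cost - threshold); negative terms can cancel positive ones."""
    return sum(c - threshold for c in costs)


def cumulative_positive_violation(costs: Sequence[float], threshold: float) -> float:
    """Sum of max(0, cost - threshold); no cancellation across episodes."""
    return sum(max(0.0, c - threshold) for c in costs)


# Hypothetical per-episode constraint costs with a threshold of 1.0.
costs = [0.5, 1.5, 0.5, 1.5]
signed = cumulative_signed_violation(costs, 1.0)      # violations cancel out
positive = cumulative_positive_violation(costs, 1.0)  # violations accumulate
```

On this toy sequence the signed sum is zero while the positive sum is not, which illustrates why the positive version is the stricter measure.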