March 7, 2024, 5:41 a.m. | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

cs.LG updates on arXiv.org

arXiv:2403.03672v1 Announce Type: new
Abstract: We study online learning problems in constrained Markov decision processes (CMDPs) with adversarial losses and stochastic hard constraints. We consider two different scenarios. In the first, we address general CMDPs, for which we design an algorithm that attains sublinear regret and sublinear cumulative positive constraint violation. In the second, under the mild assumption that a policy strictly satisfying the constraints exists and is known to the learner, we design an algorithm that achieves sublinear regret …
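
The abstract names two performance measures, regret and cumulative positive constraint violation, without defining them in the portion shown here. The sketch below illustrates the usual meaning of these terms, not the paper's exact definitions: it assumes a single constraint, one scalar loss and one scalar constraint cost per episode, and a fixed comparator policy. All function names are hypothetical. The point of "positive" violation is that episodes satisfying the constraint cannot cancel out episodes that violate it.

```python
def cumulative_regret(incurred_losses, comparator_losses):
    """Total loss incurred by the learner minus the total loss of a fixed
    comparator policy over the same episodes (assumed setup)."""
    return sum(incurred_losses) - sum(comparator_losses)


def cumulative_signed_violation(costs, threshold):
    """Sum of (cost - threshold); slack in one episode offsets
    violations in another."""
    return sum(c - threshold for c in costs)


def cumulative_positive_violation(costs, threshold):
    """Sum of max(0, cost - threshold); only violating episodes count,
    so slack elsewhere cannot hide them."""
    return sum(max(0.0, c - threshold) for c in costs)


# Hypothetical per-episode constraint costs with threshold 1.0:
# the learner alternates between satisfying and violating the constraint.
costs = [0.5, 1.5, 0.5, 1.5]
signed = cumulative_signed_violation(costs, 1.0)
positive = cumulative_positive_violation(costs, 1.0)
```

In this toy sequence the signed measure nets out to zero while the positive measure does not, which is why a bound on the positive violation is the stronger guarantee.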