March 7, 2024, 5:41 a.m. | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

cs.LG updates on arXiv.org

arXiv:2403.03672v1 Announce Type: new
Abstract: We study online learning problems in constrained Markov decision processes (CMDPs) with adversarial losses and stochastic hard constraints. We consider two different scenarios. In the first one, we address general CMDPs, where we design an algorithm that attains sublinear regret and cumulative positive constraints violation. In the second scenario, under the mild assumption that a policy strictly satisfying the constraints exists and is known to the learner, we design an algorithm that achieves sublinear regret …

