Learning Adversarial MDPs with Stochastic Hard Constraints

March 7, 2024, 5:41 a.m. | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

cs.LG updates on arXiv.org

arXiv:2403.03672v1 Announce Type: new
Abstract: We study online learning problems in constrained Markov decision processes (CMDPs) with adversarial losses and stochastic hard constraints. We consider two scenarios. In the first, we address general CMDPs and design an algorithm that attains sublinear regret and sublinear cumulative positive constraint violation. In the second, under the mild assumption that a policy strictly satisfying the constraints exists and is known to the learner, we design an algorithm that achieves sublinear regret …
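
The abstract does not define "cumulative positive constraint violation". A common reading in the constrained online learning literature, assumed here rather than taken from the paper, is that only the positive part of each round's violation is summed, so rounds that satisfy the constraint with slack cannot cancel rounds that violate it. The minimal sketch below contrasts the two quantities on made-up numbers; the function names, the single scalar constraint, and the data are all hypothetical.

```python
def cumulative_violation(costs, threshold):
    """Signed sum: rounds below the threshold offset rounds above it."""
    return sum(c - threshold for c in costs)


def cumulative_positive_violation(costs, threshold):
    """Sum of positive parts only: slack in one round never offsets a violation."""
    return sum(max(c - threshold, 0.0) for c in costs)


# Hypothetical per-round constraint costs for a policy that alternates
# between violating and over-satisfying a constraint with threshold 0.5.
costs = [0.9, 0.1, 0.9, 0.1]
threshold = 0.5

signed = cumulative_violation(costs, threshold)
positive = cumulative_positive_violation(costs, threshold)

print(f"signed cumulative violation:   {signed:.2f}")
print(f"positive cumulative violation: {positive:.2f}")
```

Under this assumed definition, the alternating policy looks harmless by the signed measure but accumulates violation by the positive one, which is why a bound on the positive quantity is the stronger guarantee.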


