Feb. 19, 2024, 5:42 a.m. | Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang

cs.LG updates on arXiv.org

arXiv:2402.10810v1 Announce Type: new
Abstract: We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure, subject to a convex constraint. Designing algorithms for a constrained convex MDP faces several challenges, including (1) handling the large state space, (2) managing the exploration/exploitation tradeoff, and (3) solving the constrained optimization where the objective and the constraint are both nonlinear functions of the visitation measure. In this work, we present a …
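As a reading aid, the problem stated in the abstract can be written schematically as below. The symbols are assumptions introduced here for illustration and are not necessarily the paper's notation: \pi is a policy, \mu^{\pi} is the visitation measure it induces, and f and g are convex functionals. The sign convention g \le 0 for the constraint is likewise an assumption.

```latex
\[
\begin{aligned}
\min_{\pi} \quad & f(\mu^{\pi}) \\
\text{subject to} \quad & g(\mu^{\pi}) \le 0,
\qquad f,\ g \ \text{convex in } \mu^{\pi}.
\end{aligned}
\]
```

Standard reward maximization is the special case in which the objective is linear in the visitation measure, for example f(\mu) = -\langle r, \mu \rangle for a reward function r.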

