Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
March 12, 2024, 4:44 a.m. | Ian A. Kash, Lev Reyzin, Zishun Yu
cs.LG updates on arXiv.org
Abstract: Reinforcement learning generalizes multi-armed bandit problems with the additional difficulties of a longer planning horizon and an unknown transition kernel. We explore a black-box reduction from discounted infinite-horizon tabular reinforcement learning to multi-armed bandits, where, specifically, an independent bandit learner is placed in each state. We show that, under ergodicity and fast-mixing assumptions, any slowly changing adversarial bandit algorithm achieving optimal regret in the adversarial bandit setting can also attain optimal expected regret in infinite-horizon discounted …
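The reduction described in the abstract can be illustrated with a minimal sketch: one independent adversarial bandit learner per state of a tabular MDP, each choosing the action whenever its state is visited. The sketch below uses Exp3 as the bandit algorithm and a hypothetical two-state, two-action toy MDP; it feeds each learner its immediate reward as a simplified stand-in for the discounted value feedback the paper's analysis actually requires, so it shows the structure of the reduction, not the paper's exact construction.

```python
import math
import random


class Exp3:
    """Exp3 adversarial multi-armed bandit learner.

    Maintains exponential weights over arms and mixes in uniform
    exploration with rate `explore` (rewards assumed in [0, 1]).
    """

    def __init__(self, n_arms: int, explore: float = 0.1):
        self.n = n_arms
        self.explore = explore
        self.weights = [1.0] * n_arms

    def probs(self) -> list[float]:
        total = sum(self.weights)
        return [(1 - self.explore) * w / total + self.explore / self.n
                for w in self.weights]

    def select(self) -> tuple[int, list[float]]:
        p = self.probs()
        arm = random.choices(range(self.n), weights=p)[0]
        return arm, p

    def update(self, arm: int, reward: float, p_arm: float) -> None:
        # Importance-weighted reward estimate, standard for Exp3.
        est = reward / p_arm
        self.weights[arm] *= math.exp(self.explore * est / self.n)


def run_reduction(n_steps: int = 2000, n_states: int = 2,
                  n_actions: int = 2) -> float:
    """Black-box reduction sketch: one bandit learner per state."""
    learners = [Exp3(n_actions) for _ in range(n_states)]
    state = 0
    total_reward = 0.0
    for _ in range(n_steps):
        arm, p = learners[state].select()
        # Hypothetical toy MDP: action 1 pays reward 1, action 0 pays 0,
        # and the next state equals the chosen action.
        reward = 1.0 if arm == 1 else 0.0
        total_reward += reward
        # The paper's reduction feeds a discounted value estimate here;
        # we use the immediate reward as a simplified stand-in.
        learners[state].update(arm, reward, p[arm])
        state = arm
    return total_reward / n_steps
```

Because each learner only observes feedback on the steps its own state is visited, the per-state bandits together define the overall policy, which is the sense in which the MDP problem reduces to independent adversarial bandit instances.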