Assessing the Impact of Distribution Shift on Reinforcement Learning Performance

Feb. 7, 2024, 5:42 a.m. | Ted Fujimoto, Joshua Suetterlein, Samrat Chatterjee, Auroop Ganguly

cs.LG updates on arXiv.org | arxiv.org

Research in machine learning is making progress in fixing its own reproducibility crisis. Reinforcement learning (RL), in particular, faces its own set of unique challenges. Comparison of point estimates, and plots that show successful convergence to the optimal policy during training, may obfuscate overfitting or dependence on the experimental setup. Although researchers in RL have proposed reliability metrics that account for uncertainty to better understand each algorithm's strengths and weaknesses, the recommendations of past work do not assume the presence …
