Improving RL with Lookahead: Learning Off-Policy with Online Planning

Jan. 8, 2022, 1:21 a.m. | Harshit Sikchi

Machine Learning Blog | ML@CMU | Carnegie Mellon University blog.ml.cmu.edu

Overview of LOOP: LOOP reduces its dependence on value-function errors by acting with an H-step lookahead policy, which plans online over learned dynamics and uses a terminal value function to account for returns beyond the planning horizon. The value function is learned efficiently by a model-free off-policy algorithm from the transitions collected in the environment while the H-step lookahead policy is deployed. LOOP performs strongly in online RL, offline RL, and safe RL, as shown on locomotion, manipulation, and navigation environments.
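
To make the H-step lookahead idea concrete, here is a minimal sketch in Python. It is not the LOOP implementation: the function names (`lookahead_action`, `toy_dynamics`, `toy_reward`, `toy_value`), the one-dimensional toy problem, and the exhaustive search over a small discrete action set are all assumptions chosen for illustration, standing in for the learned dynamics model, the learned value function, and the trajectory optimizer described in the post.

```python
from itertools import product
from typing import Callable, Sequence


def lookahead_action(
    state: float,
    dynamics: Callable[[float, float], float],
    reward: Callable[[float, float], float],
    value: Callable[[float], float],
    actions: Sequence[float] = (-1.0, 0.0, 1.0),
    horizon: int = 3,
    gamma: float = 0.99,
) -> float:
    """Return the first action of the best H-step action sequence.

    Each candidate sequence is rolled out through the (hypothetical) model,
    and the terminal value function scores the state reached after H steps.
    """
    best_return = float("-inf")
    best_first_action = actions[0]
    # Exhaustive search is an illustrative stand-in for a real planner.
    for sequence in product(actions, repeat=horizon):
        s, total, discount = state, 0.0, 1.0
        for a in sequence:
            total += discount * reward(s, a)  # model-predicted reward
            s = dynamics(s, a)                # model-predicted next state
            discount *= gamma
        total += discount * value(s)          # bootstrap beyond the horizon
        if total > best_return:
            best_return, best_first_action = total, sequence[0]
    return best_first_action


# Hypothetical toy problem: drive a 1-D state toward zero.
def toy_dynamics(s: float, a: float) -> float:
    return s + a


def toy_reward(s: float, a: float) -> float:
    return -(s ** 2)


def toy_value(s: float) -> float:
    # Stand-in for a value function learned off-policy from collected data.
    return -(s ** 2)


if __name__ == "__main__":
    print(lookahead_action(2.0, toy_dynamics, toy_reward, toy_value))
```

Only the first action of the best sequence is executed, and planning is repeated at the next state. In the framework described above, the transitions gathered this way would then be used by an off-policy learner to update the value function; that training loop is omitted here.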


