CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity
March 26, 2024, 4:44 a.m. | Aditya Bhatt, Daniel Palenicek, Boris Belousov, Max Argus, Artemij Amiranashvili, Thomas Brox, Jan Peters
cs.LG updates on arXiv.org
Abstract: Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, improve sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the critic per environment sample. However, this comes at the expense of a greatly increased computational cost. To reduce this computational burden, we introduce CrossQ, a lightweight algorithm for continuous control tasks that makes careful use of Batch …
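To make the UTD ratio concrete, below is a minimal sketch of a generic off-policy training loop in Python. The `env`, `agent`, and replay-buffer interfaces are hypothetical placeholders, not the paper's implementation; the point is only that a UTD ratio of 20 means twenty critic gradient updates for every single environment sample, which is where the extra compute cost comes from.

def train(env, agent, total_env_steps=100_000, utd_ratio=20, batch_size=256):
    """Generic off-policy loop illustrating a high update-to-data (UTD) ratio.

    `env` and `agent` are hypothetical placeholders with a Gym-style
    interface. utd_ratio = 20 matches the REDQ/DroQ setting cited in the
    abstract; CrossQ aims to match that sample efficiency more cheaply.
    """
    obs = env.reset()
    for _ in range(total_env_steps):
        # Collect exactly one environment sample.
        action = agent.act(obs)
        next_obs, reward, done, info = env.step(action)
        agent.buffer.add(obs, action, reward, next_obs, done)
        obs = env.reset() if done else next_obs

        # UTD ratio: many critic gradient updates per collected sample,
        # so total critic compute scales linearly with utd_ratio.
        for _ in range(utd_ratio):
            agent.update_critic(agent.buffer.sample(batch_size))
        agent.update_actor(agent.buffer.sample(batch_size))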