May 3, 2024, 4:54 a.m. | Wanpeng Zhang, Xi Xiao, Yao Yao, Mingzhe Chen, Dijun Luo

cs.LG updates on arXiv.org

arXiv:2108.01295v2 Announce Type: replace
Abstract: Model-based reinforcement learning is a widely accepted approach to reducing excessive sample demands. However, the predictions of the dynamics models are often not accurate enough, and the resulting bias may lead to catastrophic decisions because of insufficient robustness. It is therefore highly desirable to investigate how to improve the robustness of model-based RL algorithms while maintaining high sample efficiency. In this paper, we propose Model-Based Double-dropout Planning (MBDP) to balance robustness and efficiency. MBDP consists of …
