Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL. (arXiv:2206.14057v1 [cs.LG])
June 29, 2022, 1:11 a.m. | Ruiquan Huang, Jing Yang, Yingbin Liang
stat.ML updates on arXiv.org arxiv.org
While the primary goal of the exploration phase in reward-free reinforcement
learning (RF-RL) is to reduce the uncertainty in the estimated model with the
minimum number of trajectories, in practice the agent often needs to abide by
certain safety constraints at the same time. It remains unclear how such a safe
exploration requirement would affect the sample complexity required to
achieve the desired optimality of the obtained policy in planning. In this
work, we make a first attempt to answer this question. …