Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes. (arXiv:2201.11206v1 [cs.LG])
Jan. 28, 2022, 2:10 a.m. | Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson
cs.LG updates on arXiv.org arxiv.org
Reward-free reinforcement learning (RL) considers the setting where the agent
does not have access to a reward function during exploration, but must propose
a near-optimal policy for an arbitrary reward function revealed only after
exploring. In the tabular setting, it is well known that this is a more
difficult problem than PAC RL -- where the agent has access to the reward
function during exploration -- with optimal sample complexities in the two
settings differing by a factor of …
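To make the setting concrete, here is a minimal toy sketch (not the paper's algorithm, which targets linear MDPs) of the two-phase reward-free protocol in a small tabular MDP: the agent first explores without any reward signal to estimate the transition model, and only afterwards receives a reward function and plans against the learned model. All names and the uniform-exploration strategy are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 4, 2, 5  # states, actions, horizon (toy sizes)

# Hidden true dynamics: P_true[s, a] is a distribution over next states.
P_true = rng.dirichlet(np.ones(S), size=(S, A))

# Phase 1: explore WITHOUT rewards; estimate transitions from counts.
counts = np.zeros((S, A, S))
for _ in range(20_000):
    s = rng.integers(S)
    a = rng.integers(A)
    s_next = rng.choice(S, p=P_true[s, a])
    counts[s, a, s_next] += 1
P_hat = (counts + 1e-6) / (counts + 1e-6).sum(axis=2, keepdims=True)

# Phase 2: an arbitrary reward function is revealed only now;
# plan on the learned model with finite-horizon value iteration.
R = rng.random((S, A))
V = np.zeros(S)
for _ in range(H):
    Q = R + P_hat @ V          # Q[s, a] = R[s, a] + sum_s' P_hat[s,a,s'] V[s']
    pi = Q.argmax(axis=1)      # greedy policy at this stage
    V = Q.max(axis=1)
```

The paper's point is about how much exploration data phase 1 requires so that the phase-2 policy is near-optimal for any revealed reward, compared to PAC RL where rewards are observed during exploration.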