Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
June 14, 2024, 4:46 a.m. | Han Zhong, Jiachen Hu, Yecheng Xue, Tongyang Li, Liwei Wang
cs.LG updates on arXiv.org arxiv.org
Abstract: While quantum reinforcement learning (RL) has attracted a surge of attention recently, its theoretical understanding is limited. In particular, it remains elusive how to design provably efficient quantum RL algorithms that can address the exploration-exploitation trade-off. To this end, we propose a novel UCRL-style algorithm that takes advantage of quantum computing for tabular Markov decision processes (MDPs) with $S$ states, $A$ actions, and horizon $H$, and establish an $\mathcal{O}(\mathrm{poly}(S, A, H, \log T))$ worst-case regret …