all AI news
The Update-Equivalence Framework for Decision-Time Planning
March 19, 2024, 4:45 a.m. | Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown
cs.LG updates on arXiv.org arxiv.org
Abstract: The process of revising (or constructing) a policy at execution time -- known as decision-time planning -- has been key to achieving superhuman performance in perfect-information games like chess and Go. A recent line of work has extended decision-time planning to imperfect-information games, leading to superhuman performance in poker. However, these methods involve solving subgames whose sizes grow quickly in the amount of non-public information, making them unhelpful when the amount of non-public information is …
abstract arxiv chess cs.ai cs.lg decision framework games information key line performance planning poker policy process superhuman type update work
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
#13721 - Data Engineer - AI Model Testing
@ Qualitest | Miami, Florida, United States
Elasticsearch Administrator
@ ManTech | 201BF - Customer Site, Chantilly, VA