May 22, 2024, 4:43 a.m. | Alexander Bork, Debraj Chakraborty, Kush Grover, Jan Kretinsky, Stefanie Mohr

Source: cs.LG updates on arXiv.org

arXiv:2401.07656v3
Abstract: Strategies for partially observable Markov decision processes (POMDPs) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L* algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that …
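To make the size contrast between the tabular and the automaton representation concrete, here is a minimal sketch in Python. It is not the paper's learning method: the two-state automaton, the observation names ("clear", "wall"), the action names and the helper `tabulate` are all hypothetical, chosen only to illustrate how a small automaton with memory can stand in for a table indexed by observation histories.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class MealyStrategy:
    """Finite-memory strategy: (memory state, observation) -> (next state, action)."""
    initial: str
    transitions: dict

    def act(self, observations):
        """Return the action chosen after each observation in the sequence."""
        state, actions = self.initial, []
        for obs in observations:
            state, action = self.transitions[(state, obs)]
            actions.append(action)
        return actions


# Hypothetical toy strategy: move left until a wall is observed, then move right forever.
strategy = MealyStrategy(
    initial="q0",
    transitions={
        ("q0", "clear"): ("q0", "left"),
        ("q0", "wall"): ("q1", "right"),
        ("q1", "clear"): ("q1", "right"),
        ("q1", "wall"): ("q1", "right"),
    },
)


def tabulate(strategy, alphabet, max_len):
    """Expand the automaton into a table keyed by full observation histories."""
    table = {}
    for n in range(1, max_len + 1):
        for history in product(alphabet, repeat=n):
            table[history] = strategy.act(history)[-1]
    return table


table = tabulate(strategy, ["clear", "wall"], max_len=3)
print(len(strategy.transitions), "automaton transitions vs", len(table), "table rows")
```

In this toy setting the automaton needs four transitions regardless of the horizon, whereas the history-indexed table grows exponentially with the maximum history length.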
