May 22, 2024, 4:43 a.m. | Alexander Bork, Debraj Chakraborty, Kush Grover, Jan Kretinsky, Stefanie Mohr

cs.LG updates on arXiv.org arxiv.org

arXiv:2401.07656v3 Announce Type: replace-cross
Abstract: Strategies for partially observable Markov decision processes (POMDP) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that …

abstract algorithm arxiv automaton cs.ai cs.lg cs.lo decision learn markov memory observable processes replace representation strategies strategy tabular type via

Senior Data Engineer

@ Displate | Warsaw

Analyst, Data Analytics

@ T. Rowe Price | Owings Mills, MD - Building 4

Regulatory Data Analyst

@ Federal Reserve System | San Francisco, CA

Sr. Data Analyst

@ Bank of America | Charlotte

Data Analyst- Tech Refresh

@ CACI International Inc | 1J5 WASHINGTON DC (BOLLING AFB)

Senior AML/CFT & Data Analyst

@ Ocorian | Ebène, Mauritius