Minimum information divergence of Q-functions for dynamic treatment resumes. (arXiv:2211.08741v1 [stat.ME])
Nov. 17, 2022, 2:13 a.m. | Shinto Eguchi
stat.ML updates on arXiv.org arxiv.org
This paper presents a new application of information geometry to
reinforcement learning, focusing on dynamic treatment regimes. In a standard
framework of reinforcement learning, a Q-function is defined as the conditional
expectation of a reward given a state and an action in the single-stage
setting. We introduce an equivalence relation, called policy equivalence,
on the space of all Q-functions. A class of information divergences is
defined on the Q-function space for every stage. The main objective …
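
To make the single-stage definition concrete, here is a minimal sketch that estimates Q(s, a) = E[R | S = s, A = a] by averaging observed rewards for each (state, action) pair and then reads off the greedy policy. It is not the paper's method: the toy data, the state and action labels, and the helper names estimate_q and greedy_policy are all hypothetical. The last lines check a general fact, namely that adding a state-dependent constant to a Q-function leaves the greedy policy unchanged; whether this matches the paper's policy equivalence is an assumption, since the excerpt does not state the definition.

```python
from collections import defaultdict


def estimate_q(samples):
    """Estimate Q(s, a) = E[R | S=s, A=a] as the mean reward per (state, action)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, reward in samples:
        totals[(state, action)] += reward
        counts[(state, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def greedy_policy(q):
    """For each state, choose the action with the largest Q-value."""
    best = {}
    for (state, action), value in q.items():
        if state not in best or value > q[(state, best[state])]:
            best[state] = action
    return best


# Hypothetical single-stage data: (state, action, reward) triples.
samples = [
    ("mild", "drug_a", 1.0), ("mild", "drug_a", 0.0),
    ("mild", "drug_b", 1.0), ("mild", "drug_b", 1.0),
    ("severe", "drug_a", 1.0), ("severe", "drug_b", 0.0),
]

q_hat = estimate_q(samples)
policy = greedy_policy(q_hat)

# Shift every Q-value by a constant that depends only on the state.
shift = {"mild": 5.0, "severe": -2.0}
q_shifted = {(s, a): v + shift[s] for (s, a), v in q_hat.items()}
policy_shifted = greedy_policy(q_shifted)
```

Here q_hat and q_shifted are different functions, yet policy and policy_shifted agree on every state, which is the kind of invariance an equivalence relation on Q-functions could capture.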