Tiered Reward Functions: Specifying and Fast Learning of Desired Behavior
Feb. 19, 2024, 5:43 a.m. | Zhiyuan Zhou, Shreyas Sundara Raman, Henry Sowerby, Michael L. Littman
cs.LG updates on arXiv.org
Abstract: Reinforcement-learning agents seek to maximize a reward signal through environmental interactions. As the humans in the learning process, our job is to design reward functions that express desired behavior and enable the agent to learn such behavior swiftly. In this work, we consider the reward-design problem in tasks formulated as reaching desirable states and avoiding undesirable states. To start, we propose a strict partial ordering of the policy space to resolve trade-offs in behavior preference. We …
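To make the setup concrete, a tiered reward assigns every state to one of several ordered tiers (e.g. undesirable, neutral, goal) and gives each tier a per-step reward chosen so that reaching a higher tier is always preferred. The sketch below is a minimal, hypothetical illustration of that idea; the specific values and the `make_tiered_reward` helper are assumptions for demonstration, not the paper's construction.

```python
def make_tiered_reward(num_tiers: int, delta: float = 0.1) -> dict[int, float]:
    """Assign per-step rewards to ordered tiers 0..num_tiers-1.

    Tier 0 (worst, e.g. an undesirable absorbing state) gets a large
    penalty; the top tier (the goal) gets a positive reward; middle
    tiers get mild step costs that shrink as the tier improves, nudging
    the agent toward the goal quickly. Illustrative values only.
    """
    rewards: dict[int, float] = {}
    for tier in range(num_tiers):
        if tier == num_tiers - 1:
            rewards[tier] = 1.0                 # goal tier: positive reward
        elif tier == 0:
            rewards[tier] = -1.0                # bad tier: large penalty
        else:
            # mild negative step cost, closer to zero for better tiers
            rewards[tier] = -delta * (num_tiers - 1 - tier)
    return rewards


# Example: three tiers (bad, neutral, goal) yield strictly increasing rewards
rewards = make_tiered_reward(3)
assert rewards[0] < rewards[1] < rewards[2]
```

Because the rewards increase strictly with tier, any policy that shifts probability mass toward higher tiers earns strictly more reward, which is one simple way to realize the behavior-preference ordering the abstract describes.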