Feb. 19, 2024, 5:43 a.m. | Zhiyuan Zhou, Shreyas Sundara Raman, Henry Sowerby, Michael L. Littman

cs.LG updates on arXiv.org

arXiv:2212.03733v2 Announce Type: replace
Abstract: Reinforcement-learning agents seek to maximize a reward signal through environmental interactions. As humans, our job in the learning process is to design reward functions to express desired behavior and enable the agent to learn such behavior swiftly. In this work, we consider the reward-design problem in tasks formulated as reaching desirable states and avoiding undesirable states. To start, we propose a strict partial ordering of the policy space to resolve trade-offs in behavior preference. We …
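As an illustration only, a strict partial ordering over policies in a reach/avoid task might look like Pareto-style strict dominance over two objectives. The `PolicyStats` fields and the dominance rule below are hypothetical assumptions for the sketch, not the ordering the paper actually proposes:

```python
from dataclasses import dataclass

# Hypothetical policy summary: each policy is scored by its probability of
# reaching a desirable state and of avoiding undesirable states. These
# fields are illustrative assumptions, not the paper's model.
@dataclass(frozen=True)
class PolicyStats:
    p_reach: float  # probability of eventually reaching a desirable state
    p_avoid: float  # probability of never entering an undesirable state

def strictly_preferred(a: PolicyStats, b: PolicyStats) -> bool:
    """Pareto-style strict dominance: a is preferred to b iff a is at
    least as good on both objectives and strictly better on one.
    The relation is irreflexive and transitive (a strict partial order),
    so some policy pairs remain incomparable -- a genuine trade-off."""
    at_least_as_good = a.p_reach >= b.p_reach and a.p_avoid >= b.p_avoid
    strictly_better = a.p_reach > b.p_reach or a.p_avoid > b.p_avoid
    return at_least_as_good and strictly_better

safe_slow = PolicyStats(p_reach=0.6, p_avoid=0.99)
fast_risky = PolicyStats(p_reach=0.9, p_avoid=0.7)
dominant = PolicyStats(p_reach=0.9, p_avoid=0.99)

print(strictly_preferred(dominant, safe_slow))   # True: better reach, equal avoid
print(strictly_preferred(safe_slow, fast_risky)) # False: incomparable trade-off
print(strictly_preferred(fast_risky, safe_slow)) # False: incomparable trade-off
```

Because the order is only partial, the incomparable pair above is exactly the kind of trade-off the paper's ordering is designed to resolve.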

