July 5, 2022, 3:11 p.m. | Edgar A Aguilar

Towards Data Science - Medium towardsdatascience.com

Innocent changes to RL reward functions produce surprising behaviors

Image by Stefan Keller from Pixabay.

Problem Setting

Reinforcement Learning (RL) is a popular branch of AI/ML in which an agent learns optimal behavior by interacting with the world and maximizing a reward signal. For example, when playing a game, the agent decides how to act, and the environment (the game) rewards it accordingly.
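To make the agent-environment loop concrete, here is a minimal sketch in plain Python. It is not taken from the article: the `ToyGame` environment, its `reset`/`step` methods, and the two policies are hypothetical names invented for illustration, and the reward rule (1 point for reaching a goal position) is an assumption chosen to keep the example small.

```python
import random


class ToyGame:
    """Hypothetical 1-D game: start at position 0, score by reaching `goal`."""

    def __init__(self, goal=3, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the initial state.
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position never drops below 0.
        self.position = max(0, self.position + action)
        self.steps += 1
        reached = self.position == self.goal
        reward = 1.0 if reached else 0.0  # assumed reward: score only at the goal
        done = reached or self.steps >= self.max_steps
        return self.position, reward, done


def run_episode(env, policy):
    """Run one episode and return the total reward the agent collected."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                   # the agent decides how to act
        state, reward, done = env.step(action)   # the environment responds
        total_reward += reward
    return total_reward


def random_policy(state):
    # An untrained agent: ignores the state and moves at random.
    return random.choice([-1, 1])


def always_right(state):
    # A hand-written policy that happens to be optimal for this toy game.
    return 1


print(run_episode(ToyGame(), always_right))  # prints 1.0 (goal reached in 3 steps)
```

A learning algorithm would replace the hand-written policies with one that is adjusted over many episodes to increase `total_reward`. Because the agent only ever sees the number returned by `step`, any change to how that reward is computed changes what the agent is pushed to do.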

Atari Breakout. Here the agent’s reward signal is the score of the game. Environment …

