Jan. 10, 2022, 4:14 a.m. | /u/ddcfefff

Machine Learning www.reddit.com

Is it better for the reward function to reward the agent for making a good move, or for being in a good state? E.g., should the reward for an agent in a good state that makes a net-negative move be higher than that of an agent in a bad state that makes the same net-negative move, or vice versa?

The most basic example I can think of is if you have an env with a “target” input of 1 or …
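One way to see the difference is a minimal sketch (all names hypothetical, assuming a one-bit state with target 1 as in the example above): a state-based reward scores where the agent lands, while a move-based reward scores the change, which is the shape of a potential-based shaping term.

```python
# Hypothetical toy setup: the state is a single bit, the "target" state is 1.
# Two candidate reward designs for the same transition s -> s'.

def state_reward(next_state, target=1):
    """Reward the state the agent ends up in: +1 in the target state, else 0."""
    return 1.0 if next_state == target else 0.0

def move_reward(state, next_state, target=1):
    """Reward the move itself as the change in a potential phi over states
    (phi(s') - phi(s)), so only improvement or regression is scored."""
    phi = lambda s: 1.0 if s == target else 0.0
    return phi(next_state) - phi(state)

# Under the state-based design, a net-negative move from a good state can
# still score no worse than the same move from a bad state; under the
# move-based design, only the change matters.
print(state_reward(1), state_reward(0))      # 1.0 0.0
print(move_reward(1, 0), move_reward(0, 0))  # -1.0 0.0
```

The `phi(s') - phi(s)` form is the classic potential-based shaping trick, which provably leaves the optimal policy unchanged, whereas a raw state reward can bias the agent toward lingering in good states.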

machinelearning rl
