Web: https://www.reddit.com/r/reinforcementlearning/comments/sd4144/average_reward_algorithms_when_reward/

Jan. 26, 2022, 11:39 a.m. | /u/fedetask

Reinforcement Learning reddit.com

Assume my RL agent has to balance load across a set of servers. This problem fits the average-reward formulation well: there are no episodes, only an infinite-horizon task in which we want to maximize the average throughput and minimize the average delay.
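For readers who have not seen the average-reward setting, here is a minimal sketch of one tabular update in the style of differential Q-learning. It is only an illustration, not something from the post: the function name `differential_q_step`, the step sizes `alpha` and `eta`, and the toy sizes (3 load levels, 2 servers) are all assumptions. The point is that there is no discount factor, and rewards are measured relative to a running estimate of the average reward.

```python
def differential_q_step(q, avg_reward, s, a, r, s_next, alpha=0.1, eta=0.1):
    """One tabular update for a continuing task (no episodes, no discounting).

    q          -- list of lists, q[state][action] (hypothetical tabular values)
    avg_reward -- current estimate of the long-run average reward
    Returns the updated average-reward estimate and the TD error.
    """
    # TD error relative to the average-reward estimate instead of a discount.
    delta = r - avg_reward + max(q[s_next]) - q[s][a]
    # Move the differential action value toward the target.
    q[s][a] += alpha * delta
    # Nudge the average-reward estimate with the same TD error.
    avg_reward += eta * alpha * delta
    return avg_reward, delta


# Hypothetical toy setup: 3 load levels (states), 2 servers (actions).
q = [[0.0, 0.0] for _ in range(3)]
avg_reward = 0.0
avg_reward, delta = differential_q_step(q, avg_reward, s=0, a=1, r=1.0, s_next=2)
```

In a real load balancer the reward `r` would be some combination of throughput and delay; how to weight the two is left open here.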

Now, assume that traffic on the servers is high during the day and low at night. The average reward that the agent can achieve will therefore depend on the time …
