Jan. 23, 2022 | /u/AhmedNizam_

I have an RL agent that has to move to a target location. Currently, the area is 1000x1000 units and the agent moves 100 units per step (in any direction). At each step, the agent gets a positive reward for moving towards the target and a negative reward otherwise (so the reward is shaped; the agent receives nothing extra at the end of the episode). This works perfectly fine, and the agent learns to go to the target. …
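For anyone trying to picture the setup, here is a rough sketch of what a per-step shaped reward like this could look like. This is not my actual code: the function names (`move`, `shaped_reward`), the +1/-1 reward magnitudes, and clamping the agent to the area bounds are all assumptions for illustration. Only the 1000x1000 area and the 100 units/step come from the description above.

```python
import math

AREA_SIZE = 1000.0  # side length of the square area (from the post)
STEP_SIZE = 100.0   # distance moved per step (from the post)


def move(pos, heading):
    """Move STEP_SIZE units along `heading` (radians).

    Clamping to the area bounds is an assumption; the post does not say
    what happens at the edges.
    """
    x = pos[0] + STEP_SIZE * math.cos(heading)
    y = pos[1] + STEP_SIZE * math.sin(heading)
    x = min(max(x, 0.0), AREA_SIZE)
    y = min(max(y, 0.0), AREA_SIZE)
    return (x, y)


def shaped_reward(prev_pos, new_pos, target):
    """Positive reward if the step reduced the distance to the target,
    negative otherwise. The +1/-1 magnitudes are hypothetical."""
    prev_dist = math.dist(prev_pos, target)
    new_dist = math.dist(new_pos, target)
    return 1.0 if new_dist < prev_dist else -1.0


if __name__ == "__main__":
    start, target = (500.0, 500.0), (900.0, 500.0)
    toward = move(start, 0.0)        # heading straight at the target
    away = move(start, math.pi)      # heading directly away from it
    print(shaped_reward(start, toward, target))
    print(shaped_reward(start, away, target))
```

A common variant would be to return `prev_dist - new_dist` directly, so the reward scales with how much progress the step made, but the sign-only version is closer to what is described above.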

