Q-learning and Sarsa in grid environment for short-term vs long-term rewards
Jan. 11, 2022, 10:07 a.m. | /u/studentani
Artificial Intelligence www.reddit.com
I built a custom 7 by 7 grid environment to try out RL algorithms. In particular, I chose Q-learning and Sarsa.
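For reference, here is a minimal sketch of how the two tabular update rules differ. This is not the code from the post: the function names, the four-action layout, and the default `alpha` and `gamma` values are all assumptions made for illustration.

```python
import numpy as np

N_STATES = 7 * 7   # one state per grid cell
N_ACTIONS = 4      # assumed action set: up, down, left, right


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r if done else r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])


# Both rules operate on the same kind of table.
Q = np.zeros((N_STATES, N_ACTIONS))
```

The only difference is the bootstrap term: Q-learning uses the maximum over next actions, while Sarsa uses the value of the next action chosen by the behavior policy (for example, epsilon-greedy).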
The grid environment has 3 types of terminal states: states with a negative reward (-100), one state with the maximum reward (100), and 2 states with half the maximum reward (50).
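A grid like this could be sketched roughly as follows. The cell positions, the number of negative states, the zero reward on non-terminal steps, and the wall-clipping behavior are all hypothetical, since the post does not specify them.

```python
GRID_SIZE = 7

# Hypothetical layout: the post does not give the actual cell positions
# or how many negative-reward states there are.
TERMINAL_REWARDS = {
    (6, 6): 100,   # maximum reward
    (2, 2): 50,    # half reward
    (4, 1): 50,    # half reward
    (3, 3): -100,  # negative reward
    (5, 4): -100,  # negative reward
}

# Assumed action encoding: 0 = up, 1 = down, 2 = left, 3 = right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def step(state, action, step_reward=0):
    """Move one cell, clipping at the walls; return (next_state, reward, done)."""
    d_row, d_col = MOVES[action]
    row = min(max(state[0] + d_row, 0), GRID_SIZE - 1)
    col = min(max(state[1] + d_col, 0), GRID_SIZE - 1)
    next_state = (row, col)
    if next_state in TERMINAL_REWARDS:
        return next_state, TERMINAL_REWARDS[next_state], True
    return next_state, step_reward, False
```

For example, `step((6, 5), 3)` moves right into the maximum-reward cell and ends the episode, while `step((0, 0), 0)` bumps into the top wall and leaves the agent where it was.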
The main goal of training is for the agent to avoid the states with negative rewards and to prefer the long-term reward (100) over the short-term half reward (50).
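One thing that may be worth checking in a setup like this is how the discount factor trades off the two rewards. The sketch below assumes zero reward on non-terminal steps and an agent that acts greedily on converged values. The helper names and the example distances are hypothetical and not taken from the post.

```python
def discounted_return(reward, steps, gamma):
    """Return from the start state when a terminal reward arrives on move `steps`.

    Assumes every earlier move gives zero reward, so the terminal reward
    is discounted by gamma ** (steps - 1).
    """
    return gamma ** (steps - 1) * reward


def prefers_far_goal(near_steps, far_steps, gamma, near_reward=50, far_reward=100):
    """True if the distant full reward is worth more than the nearby half reward."""
    far_value = discounted_return(far_reward, far_steps, gamma)
    near_value = discounted_return(near_reward, near_steps, gamma)
    return far_value > near_value


# Hypothetical distances: half reward 2 moves away, full reward 8 moves away.
print(prefers_far_goal(2, 8, gamma=0.9))  # True
print(prefers_far_goal(2, 8, gamma=0.8))  # False
```

Under these assumptions, the full reward wins only when `gamma ** (far_steps - near_steps)` is greater than 0.5, so a lower discount factor or a larger distance gap tilts the greedy choice toward the half reward.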
The trained agent works weirdly when the half-rewarded state is closer to the …