April 29, 2022, 8:02 p.m. | blackburn

Towards AI - Medium pub.towardsai.net

Model-free learning.

In the preceding three blogs, we talked about how we formulate an RL problem as MDPs and solve them using Dynamic Programming (Value iteration and Policy iteration). Everything we have seen so far deals with the environments we had full knowledge about i.e.transition probability, reward function etc. A natural question would be, what about the environments in which we don’t have the luxury of these functions and probabilities? This is where a different class of reinforcement learning known …

deep learning learning machine learning monte-carlo reinforcement reinforcement learning

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Data Scientist (Database Development)

@ Nasdaq | Bengaluru-Affluence