Sept. 22, 2022, 1:12 a.m. | Ryan Sullivan, J. K. Terry, Benjamin Black, John P. Dickerson

cs.LG updates on arXiv.org

Visualizing optimization landscapes has led to many fundamental insights in
numeric optimization and to novel improvements in optimization techniques.
However, visualizations of the objective that reinforcement learning optimizes
(the "reward surface") have been generated for only a small number of
narrow contexts. This work presents, for the first time, reward surfaces and
related visualizations of 27 of the most widely used reinforcement learning
environments in Gym. We also explore reward surfaces in the policy gradient
direction and show for the …