Web: https://www.reddit.com/r/reinforcementlearning/comments/scctub/dagent_always_taking_suboptimal_action_with_high/

Jan. 25, 2022, 12:41 p.m. | /u/amjass12

Reinforcement Learning reddit.com


My question is slightly related to a similar post I made recently (https://www.reddit.com/r/reinforcementlearning/comments/s7ptys/d_ddpg_not_converging_were_actor_critic_and_dqn/hted26u/?context=3). I am training an actor-critic agent using a Keras implementation that I have tested and that works well on CartPole (although it is a simple environment, I don't think there are specific bugs in the implementation). My task is a graph optimisation in which the agent makes or removes connections on an adjacency matrix (encoded as 0 or 1 for no connection or connection respectively) - …
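To make the setup above concrete, here is a minimal sketch of how an action could make or remove a connection on a 0/1 adjacency matrix. The post does not say how actions are encoded or whether the graph is undirected, so the flat action index, the symmetric update, and the name `apply_action` are all assumptions for illustration only.

```python
import numpy as np


def apply_action(adj, action):
    """Toggle one connection in a 0/1 adjacency matrix.

    Hypothetical encoding: `action` is a flat index into the n x n matrix.
    """
    n = adj.shape[0]
    i, j = divmod(action, n)  # assumed mapping from flat index to (row, col)
    new_adj = adj.copy()  # leave the caller's matrix untouched
    new_adj[i, j] = 1 - new_adj[i, j]  # 0 -> 1 makes a connection, 1 -> 0 removes it
    new_adj[j, i] = new_adj[i, j]  # assumed: undirected graph, so keep it symmetric
    return new_adj


adj = np.zeros((3, 3), dtype=int)
adj_after = apply_action(adj, 1)  # toggles the edge between nodes 0 and 1
adj_back = apply_action(adj_after, 1)  # the same action removes it again
```

With this encoding the same action both adds and removes a given edge, so the policy only has to choose which cell to flip; a directed graph would simply drop the symmetric update.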

