Hello,

Heads up: this might sound complicated, but the idea is simple.

I have an RL agent trying to solve a certain problem, trained with PPO. I also have an expert, i.e. an agent that already knows how to tackle the given problem. I am assuming that, in simulation, I have access to the expert policy (meaning I can easily generate trajectories using the expert). I am trying to use the expert to help with speeding up …

