Hi! I'm currently working on deploying a policy trained with SAC in MuJoCo onto a real robot. I'm trying to load the two Q-functions, but I get weird results in q_loss and the returns. Any suggestions on how to load the policy correctly? Thanks!
I would recommend comparing the observation distribution between your real-world environment and simulated environment. The most obvious difficulties in sim2real transfer are due to that mismatch.
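As a starting point, here is a minimal sketch of what that comparison could look like: roll out the policy in each environment, then flag observation dimensions whose real-world mean drifts far from the simulated mean. The rollout loop assumes a gym-style `env`/`policy` interface; all names here are placeholders for your own setup, not this repo's API:

```python
import numpy as np

def collect_observations(env, policy, n_steps=1000):
    """Roll out `policy` in `env` and stack the observations."""
    obs_list = []
    obs = env.reset()
    for _ in range(n_steps):
        obs_list.append(obs)
        action = policy(obs)
        obs, _, done, _ = env.step(action)
        if done:
            obs = env.reset()
    return np.asarray(obs_list)

def compare_distributions(sim_obs, real_obs, tol=2.0):
    """Flag dimensions whose real-world mean is more than `tol` simulated
    standard deviations away from the simulated mean."""
    sim_mean = sim_obs.mean(axis=0)
    sim_std = sim_obs.std(axis=0) + 1e-8  # avoid division by zero
    drift = np.abs(real_obs.mean(axis=0) - sim_mean) / sim_std
    for i, d in enumerate(drift):
        if d > tol:
            print(f"dim {i}: real mean is {d:.1f} sim std-devs from sim mean")
```

Dimensions that show large drift (scaling, unit, or sensor-offset mismatches) are the usual suspects for a policy behaving well in simulation but poorly on hardware.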
Note that the docs already describe how to use a pre-trained policy.
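For deployment you generally only need the policy network, not the Q-functions (those only matter if you keep training). As a rough illustration of that separation, with the checkpoint path, dictionary key, and inference call below all being assumptions about your setup rather than this repo's actual format, it might look like:

```python
import pickle
import numpy as np

# Hypothetical checkpoint path and layout -- see the docs for the real format.
with open("checkpoint.pkl", "rb") as f:
    checkpoint = pickle.load(f)

policy = checkpoint["policy"]  # hypothetical key; load only the actor

obs = np.zeros(17)                    # placeholder observation
action = policy(obs[None])[0]         # hypothetical inference call
```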
Continuing to train after transferring from sim2real is an active area of research, and I don't have a firm recommendation for how to achieve it. In particular, a Q function describing a policy's behavior in simulation is likely to over-estimate the performance of that policy on the real environment.
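One way to sanity-check that over-estimation is to compare, for each real-world episode, the Q-value predicted at the first state-action pair against the discounted return you actually observe. This is only a hedged sketch with placeholder `policy` and `q_function` call signatures:

```python
import numpy as np

def discounted_return(rewards, gamma=0.99):
    return sum(r * gamma**t for t, r in enumerate(rewards))

def check_q_overestimation(env, policy, q_function, n_episodes=10, gamma=0.99):
    for ep in range(n_episodes):
        obs = env.reset()
        action = policy(obs)
        q_pred = q_function(obs, action)  # placeholder call signature
        rewards, done = [], False
        while not done:
            obs, reward, done, _ = env.step(action)
            rewards.append(reward)
            action = policy(obs)
        empirical = discounted_return(rewards, gamma)
        print(f"episode {ep}: Q predicted {q_pred:.2f}, "
              f"observed return {empirical:.2f}")
```

If the predicted Q-values consistently sit well above the empirical returns, the weird q_loss you're seeing is expected rather than a loading bug.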