This repo implements the Plan2Explore algorithm from "Planning to Explore via Self-Supervised World Models", built on top of PlaNet-Pytorch. It has been confirmed to work on DeepMind Control Suite/MuJoCo environments. Hyperparameters are taken from the paper.
To install all dependencies, first create and activate an Anaconda (conda) environment, then run the following command inside it:
pip install -r requirements.txt
Zero-shot
python main.py --algo p2e --env walker-walk --action-repeat 2 --id name-of-experiment --zero-shot
Few-shot
python main.py --algo p2e --env walker-walk --action-repeat 2 --id name-of-experiment
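If you launch both variants from a script, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of this repo: the helper `build_command` and the run ids `walker-zero-shot` and `walker-few-shot` are hypothetical names chosen for illustration, and only the flags shown in the commands above are used.

```python
import shlex


def build_command(env, run_id, zero_shot=False, action_repeat=2):
    """Return the argument list for one main.py run (hypothetical helper)."""
    cmd = [
        "python", "main.py",
        "--algo", "p2e",
        "--env", env,
        "--action-repeat", str(action_repeat),
        "--id", run_id,
    ]
    # The zero-shot command differs from the few-shot one only by this flag.
    if zero_shot:
        cmd.append("--zero-shot")
    return cmd


# Run ids below are placeholders; substitute your own experiment names.
zero_shot_cmd = build_command("walker-walk", "walker-zero-shot", zero_shot=True)
few_shot_cmd = build_command("walker-walk", "walker-few-shot")

# Print the commands in copy-pasteable shell form.
print(shlex.join(zero_shot_cmd))
print(shlex.join(few_shot_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.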
For best performance with DeepMind Control Suite, try setting the environment variable MUJOCO_GL=egl (see the Rendering section of the dm_control README for instructions and details).
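The variable can be exported in the shell before running main.py, or set from Python before dm_control is imported. The snippet below is a minimal sketch of the second option: the helper `select_mujoco_backend` is a hypothetical name, not something this repo provides, and the list of accepted values reflects the backends described in the dm_control documentation.

```python
import os

# Rendering backends that dm_control documents for MUJOCO_GL.
VALID_BACKENDS = ("glfw", "egl", "osmesa")


def select_mujoco_backend(backend="egl"):
    """Set MUJOCO_GL for this process (hypothetical helper)."""
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"Unknown backend {backend!r}; expected one of {VALID_BACKENDS}"
        )
    os.environ["MUJOCO_GL"] = backend
    return backend


# Call this before anything imports dm_control, because the backend is
# chosen when the rendering module is first loaded.
select_mujoco_backend("egl")
```

EGL is intended for headless, GPU-accelerated rendering; if it is not available on your machine, `osmesa` (software rendering) is the usual fallback.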
We used Weights & Biases to log the runs.
The performance of the zero-shot or few-shot trained policy is reported under the test/episode_reward metric.