HumanoidRobotWalk

Implementation of the Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) algorithms, applied to the task of teaching a humanoid robot to walk.

Programs & libraries needed in order to run this project

OpenAI Gym : A toolkit for developing and comparing reinforcement learning algorithms
PyBullet Gym : PyBullet Robotics Environments fully compatible with Gym toolkit (uses the Bullet physics engine)
PyTorch : Open source machine learning library based on the Torch library
NumPy : Fundamental package for scientific computing with Python
matplotlib : Plotting library for the Python programming language and its numerical mathematics extension NumPy
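A quick way to confirm that the libraries listed above are installed is a small import check like the one below. It is only a sketch: the import names (`gym`, `pybulletgym`, `torch`, `numpy`, `matplotlib`) are the usual ones for these packages but are assumed here rather than taken from the project, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the libraries listed above (not taken from the project).
REQUIRED_MODULES = ["gym", "pybulletgym", "torch", "numpy", "matplotlib"]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are importable.")
```

The check only looks the modules up without importing them, so it runs quickly even when PyTorch is installed.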

Algorithm pseudocode

Trust Region Policy Optimization (TRPO) - implemented by Vasilije Pantić
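For reference, the standard TRPO update, as it is usually stated in the literature and not extracted from this repository's code, maximizes a surrogate objective subject to a bound on the average KL divergence between the old and new policies. The symbols are the conventional ones and are assumed here: $\pi_\theta$ is the policy, $\hat{A}_t$ an advantage estimate, and $\delta$ the trust-region size.

```latex
\begin{aligned}
\max_{\theta}\quad
  & \hat{\mathbb{E}}_t\!\left[
      \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}\,\hat{A}_t
    \right] \\
\text{subject to}\quad
  & \hat{\mathbb{E}}_t\!\left[
      D_{\mathrm{KL}}\!\left(
        \pi_{\theta_{\text{old}}}(\cdot \mid s_t) \,\|\, \pi_\theta(\cdot \mid s_t)
      \right)
    \right] \le \delta
\end{aligned}
```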

Proximal Policy Optimization (PPO) - implemented by Nikola Zubić
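The core of PPO is commonly written as a clipped surrogate objective. The function below is a minimal pure-Python sketch of that standard formulation, not the code used in this project: the name `ppo_clip_objective`, its arguments, and the default clipping range of 0.2 are assumptions made for illustration.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of samples.

    All three inputs are equal-length sequences of floats. clip_eps is the
    clipping range (0.2 is a commonly used value, assumed here).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Keep the smaller (more pessimistic) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a gradient-based implementation this quantity is maximized, which in practice usually means minimizing its negative with an optimizer.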

How to run?

For TRPO: run trpo_main.py in root/code/trpo/.
For PPO: run ppo_main.py in root/code/ppo/.
In both cases, enter the absolute file path to the trained model.

Trained models are available at: root/code/trained_models/.
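Because the scripts expect an absolute path, a check like the following can catch a relative or mistyped path before a model is loaded. This is a sketch only: `resolve_model_path` is a hypothetical helper that is not part of the project, and it makes no assumption about the model file format.

```python
import os


def resolve_model_path(path):
    """Check that `path` is an absolute path to an existing file and return it."""
    path = path.strip()
    if not os.path.isabs(path):
        raise ValueError(f"Expected an absolute path, got: {path!r}")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No trained model found at: {path}")
    return path
```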

In motion

TRPO

PPO

Numerical results

TRPO
Training time [h]	24	96

PPO
Training time [h]	6.5	48
