Status: Under construction.
Amca is an RL-based Backgammon agent.
Dependency | Version Tested On
---|---
Ubuntu | 16.04
Python | 3.6.8
numpy | 1.15.4
gym | 0.10.9
Stable Baselines | 2.4.0a
This project formulates Backgammon as a reinforcement learning problem and gauges the performance of common RL algorithms. This is done by training and evaluating three popular deep RL algorithms alongside a classical baseline:
- Deep Q-Network (Mnih et al.)
- Proximal Policy Optimization (Schulman et al.)
- Soft Actor-Critic (Haarnoja et al.)
- SARSA (Rummery and Niranjan)
The three deep RL algorithms are tested with the default hyperparameters and implementations provided by the Stable Baselines library; a sketch of the typical training call is shown below. SARSA uses a custom implementation, heavily modified from this repo, with its hyperparameters given in the SarsaAgent object.
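For orientation, training with Stable Baselines defaults generally reduces to the standard 2.x pattern sketched below. This is a minimal illustration, not the repo's actual train.py; in particular, the environment ID `Backgammon-v0` is a placeholder, since the ID this repo registers with gym may differ.

```python
# Minimal sketch of the standard Stable Baselines (2.x) training pattern.
# 'Backgammon-v0' is a hypothetical env ID, not necessarily what this repo registers.
import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: gym.make('Backgammon-v0')])  # placeholder env ID
model = PPO2('MlpPolicy', env, verbose=1)               # default hyperparameters
model.learn(total_timesteps=1000000)
model.save('amca.pkl')
```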
- play.py: to launch a game against a deep-RL-trained model. For example,

  ```
  python play.py ppo amca/models/amca.pkl
  ```

  will launch the model called `amca.pkl` that was trained using the PPO algorithm.
- train.py: to train a deep RL model (with default hyperparameters) to play. For example,

  ```
  python train.py -n terminator.pkl -a sac -t 1000000
  ```

  will train an agent called `terminator.pkl` using the SAC algorithm for 1000000 steps.
- sarsa_play.py: to launch a game against a SARSA-trained model. For example,

  ```
  python sarsa_play.py r2d2.pkl
  ```

  will launch the model called `r2d2.pkl` that was trained using the SARSA algorithm.
- sarsa_train.py: to train a model using SARSA. For example,

  ```
  python sarsa_train.py jarvis.pkl -g 10000
  ```

  will train an agent called `jarvis.pkl` using the SARSA algorithm for 10000 games.
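For reference, the custom SarsaAgent presumably performs the classic on-policy TD(0) update of Rummery and Niranjan. Below is a minimal tabular sketch assuming an epsilon-greedy policy and a dictionary-backed Q-table; all names here are illustrative, not the repo's actual API, and the real agent's state encoding and hyperparameters may differ.

```python
# Illustrative tabular SARSA; the repo's SarsaAgent may differ in
# state/action encoding and uses its own hyperparameters.
import random
from collections import defaultdict

Q = defaultdict(float)  # Q[(state, action)] -> estimated value

def choose_action(state, actions, epsilon=0.1):
    """Epsilon-greedy selection over the legal actions for this state."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def sarsa_update(s, a, reward, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD(0): Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q[(s, a)] += alpha * (reward + gamma * Q[(s_next, a_next)] - Q[(s, a)])
```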