Implementing reinforcement learning and several deep reinforcement learning models for OpenAI's lunar lander

ziadzee/Reinforcement-Leanring-with-Open-AI

Reinforcement Learning Implementations

  • Implemented a Q-Learning algorithm on a Grid World environment in which an agent (a mouse) navigates a grid, collecting rewards (cheese) with the goal of escaping the environment. The agent can take four actions (Up, Down, Left and Right), which move her between states.

  • Applied the Soft Actor-Critic model proposed by Haarnoja et al. (2018) to OpenAI's Lunar Lander problem.

  • Implemented Double DQN, Dueling DQN and Prioritised DQN and compared their performance on the Lunar Lander problem.
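
To make the Grid World setup concrete, here is a minimal sketch of such an environment. It is not the notebook's code: the 3x3 layout, the cheese position, the exit cell and the escape bonus are all hypothetical values chosen for illustration, and the `GridWorld` class name is an assumption.

```python
# Hypothetical layout and rewards; the notebook's actual grid is not reproduced here.
ACTIONS = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}


class GridWorld:
    def __init__(self, n_rows=3, n_cols=3, cheese=None, exit_cell=(2, 2)):
        self.n_rows, self.n_cols = n_rows, n_cols
        # Map from cell to reward; one piece of cheese by default (assumed).
        self.cheese_start = {(0, 2): 1.0} if cheese is None else dict(cheese)
        self.exit_cell = exit_cell
        self.reset()

    def reset(self):
        """Put the mouse back in the top-left cell and restore the cheese."""
        self.pos = (0, 0)
        self.cheese = dict(self.cheese_start)
        return self.pos

    def step(self, action):
        """Apply one action and return (next_state, reward, done)."""
        d_row, d_col = ACTIONS[action]
        # Moves that would leave the grid keep the mouse at the wall.
        row = min(max(self.pos[0] + d_row, 0), self.n_rows - 1)
        col = min(max(self.pos[1] + d_col, 0), self.n_cols - 1)
        self.pos = (row, col)
        reward = self.cheese.pop(self.pos, 0.0)  # cheese can be collected once
        done = self.pos == self.exit_cell
        if done:
            reward += 10.0  # assumed bonus for escaping
        return self.pos, reward, done
```

For simplicity the state here is only the mouse's position; a fuller version would also track which cheese has already been eaten.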

Packages

  • Python Version: 3.8.11

  • Libraries and Packages: numpy, pandas, seaborn, operator, torch, gym, plotly, random, collections (operator, random and collections are part of the Python standard library)

Outcomes

Q-Learning Grid World

  • For our Q-Learning algorithm, the random policy performs poorly and unpredictably, as expected:

image

  • Results for different values of Alpha (the learning rate), Gamma (the discount factor) and Epsilon (which governs action selection):

image

image

image
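
As a reminder of where the three hyperparameters enter, the following is a minimal sketch of tabular Q-learning, not the notebook's code. It assumes epsilon-greedy action selection and a dictionary-backed Q-table; the function names `epsilon_greedy` and `q_update` are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ["Up", "Down", "Left", "Right"]


def epsilon_greedy(q_table, state, epsilon, rng=random):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, done, alpha, gamma):
    """One Q-learning step: move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    # No bootstrapping from a terminal state.
    best_next = 0.0 if done else max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    # alpha sets how far the estimate moves towards the target.
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Unseen (state, action) pairs default to a value of 0.0.
q_table = defaultdict(float)
```

A larger Alpha makes each update more aggressive, a larger Gamma weights future rewards more heavily, and a larger Epsilon means more random exploration.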

Soft Actor-Critic (Deep Reinforcement Learning)

  • The Soft Actor-Critic did not perform well in our implementation for the Lunar Lander problem. The agent behaved very stochastically and failed to learn the rules of the game adequately, even after many epochs:

image
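
For context on the stochastic behaviour, the sketch below shows a commonly used form of the entropy-regularised target that Soft Actor-Critic critics are trained on. It is an illustration under assumptions, not the notebook's implementation: it assumes two critics and a fixed temperature `alpha`, and the function name `soft_q_target` and the default values are hypothetical.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target for one transition.

    q1_next and q2_next are the two critics' values for an action sampled
    from the policy in the next state; log_prob_next is that action's
    log-probability under the policy.
    """
    # Take the smaller critic estimate, then add the entropy bonus
    # (-alpha * log pi), which rewards the policy for staying random.
    soft_value = min(q1_next, q2_next) - alpha * log_prob_next
    # No bootstrapping once the episode has ended.
    return reward + gamma * (1.0 - float(done)) * soft_value
```

The temperature `alpha` controls how strongly randomness is rewarded relative to the environment's reward, so it is one of the settings worth checking when an agent stays highly stochastic.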

Double DQN, Dueling DQN and Prioritised DQN (Deep Reinforcement Learning)

  • Lastly, of the three DQN models, the Prioritised DQN model performed best as measured by the loss function. Notably, all three models achieved negative rewards across all epochs:

image
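
The three variants differ in one core idea each, which can be sketched in plain Python as follows. These are illustrative stand-ins rather than the notebook's torch code: the function names and default values are assumptions, and the prioritised sampling shown is the proportional variant.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network picks the action, the target network scores it."""
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    bootstrap = 0.0 if done else q_target_next[best_action]
    return reward + gamma * bootstrap


def dueling_q_values(state_value, advantages):
    """Dueling DQN: combine V(s) and A(s, a), subtracting the mean advantage."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Prioritised replay: sample transitions in proportion to |TD error|^alpha."""
    # eps keeps transitions with zero error from never being replayed.
    priorities = [(abs(err) + eps) ** alpha for err in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

For example, `double_dqn_target(1.0, False, [0.1, 0.9], [5.0, 2.0], gamma=0.5)` selects the second action (highest online value) and bootstraps from the target network's value of 2.0 for it, rather than from the target network's own maximum of 5.0.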

Specifications

Code and outcomes for the above can be found in the following Jupyter notebooks:

  • Q-Learning Grid World: Q_Learning_Grid_World
  • Soft Actor-Critic: Soft_Actor_Critic
  • Double DQN, Dueling DQN and Prioritised DQN: DQN_comparison

For more information regarding Lunar Lander and OpenAI, see https://gym.openai.com/envs/LunarLander-v2/
