Change Log

wakemaster39 edited this page Aug 29, 2017 · 2 revisions

This page will serve as our mechanism for letting you know about changes we have made, so you can track the progress of snek.ai. We will do our best to add some details about why we changed things and to let you know about brain resets, but this isn't meant to be a full breakdown of our code.

We are happy to answer any questions in chat and take suggestions about what to try next though.

August 28, 2017

  • Brain wipe
  • Pretty much an entire code base rewrite. We had a number of improvements that led to the discovery of critical bugs and flaws in our existing code base. These affected pretty much every aspect of what we were doing, so while we were fixing them we also made a number of other improvements.
  • We are now using Google's A3C-style algorithm. We are working in a discretized action space rather than using a continuous output, mostly because of the boost functionality and because we are not familiar enough with how the math works out in the continuous case.
  • Moved to distributed TensorFlow and removed all Keras references. This was a personal preference, as I felt that Keras was hiding a few too many of the implementation details that I wanted to get my hands on. This will also hopefully allow us to scale the learning process further.
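
To give a rough idea of what a discretized action space with boost can look like, here is a minimal sketch. It is not our actual code: the `Action` type, the `build_action_table` helper, and the choice of 8 heading directions are all assumptions made for illustration. The idea is that each discrete network output maps to one (heading, boost) pair.

```python
import math
from typing import List, NamedTuple


class Action(NamedTuple):
    """One discrete action (hypothetical layout, for illustration only)."""
    angle: float  # heading in radians, measured from the positive x axis
    boost: bool   # whether boost is held while moving


def build_action_table(num_directions: int = 8) -> List[Action]:
    """Enumerate every (heading, boost) combination as a flat list.

    The first `num_directions` entries have boost off and the next
    `num_directions` entries have boost on. The default of 8 directions
    is an assumed value, not one taken from the project.
    """
    actions = []
    for boost in (False, True):
        for i in range(num_directions):
            angle = 2 * math.pi * i / num_directions
            actions.append(Action(angle, boost))
    return actions


ACTIONS = build_action_table()


def decode(index: int) -> Action:
    """Turn the index of the chosen network output into a game action."""
    return ACTIONS[index]
```

With this layout the network only needs one output per table entry, and boost becomes just another discrete choice instead of a separate continuous or binary head.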

August 22, 2017

  • Brain wipe
  • Model was updated again. We have moved from backpropagating discounted rewards to an Actor/Critic model. The actor is a policy gradient with a learning rate of 10^-3 and the DQN critic has a learning rate of 10^-4. This will hopefully lead to a smarter bot that makes better short- and long-term decisions.
  • A number of backend upgrades to try and remove some bugs and improve efficiency
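
For readers unfamiliar with the terms above, the following is a minimal sketch of discounted rewards and of the advantage signal an actor can be trained on when a critic is available. It is not our actual code: the function names, the discount factor of 0.99, and the use of plain per-step value estimates as the critic output are assumptions for illustration. Only the two learning rates come from the entry above.

```python
from typing import List, Sequence

# Learning rates quoted in the change log entry above.
ACTOR_LEARNING_RATE = 1e-3
CRITIC_LEARNING_RATE = 1e-4


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of an episode.

    `gamma` is an assumed discount factor, not a value from the project.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns: Sequence[float], values: Sequence[float]) -> List[float]:
    """Advantage = observed return minus the critic's estimate for that step.

    A positive advantage means the action did better than the critic
    expected, so the actor is nudged towards it, and vice versa.
    """
    return [g - v for g, v in zip(returns, values)]
```

For example, `discounted_returns([1, 1, 1], gamma=0.5)` weights the later rewards less and less, and subtracting the critic's estimates turns those returns into a "better or worse than expected" signal for the actor.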

August 21, 2017

  • Brain wipe
  • No changes made before this date will be listed here; they are too numerous to count
  • Model was updated. We attempted a couple of iterations of changing the controls to mouse movements. This is working, but the model is a little odd: it appears to have resulted in a lot of random movements and not exactly the result we were looking for.
  • Multiprocessing agent rewrite was implemented. We made snek.ai able to have multiple friends running simultaneously. There is still a lot of testing to be done around the agent structure, but it looks promising and hopefully more entertaining.
  • Rescaled rewards based on new controls