The goal is to train Deep Reinforcement Learning (DRL) agents that receive image inputs from a game simulator and output game actions to play the game autonomously. The following simulator is used to play SuperMarioBros 1-1-v0: https://github.com/Kautenja/gym-super-mario-bros
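The agent-simulator loop described above can be sketched as follows. This is a minimal sketch, not the project's actual code: the helper names `make_mario_env` and `run_episode` are hypothetical, the environment id `SuperMarioBros-1-1-v0` and the `SIMPLE_MOVEMENT` action set are assumptions based on the simulator's README, and the loop accepts both the 4-value and 5-value `step` return formats because the format depends on the installed Gym version.

```python
def make_mario_env(env_id="SuperMarioBros-1-1-v0"):
    """Build the simulator (assumed env id and action set).

    Imports are local so that run_episode below can be used
    and tested without the simulator installed.
    """
    import gym_super_mario_bros
    from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
    from nes_py.wrappers import JoypadSpace

    env = gym_super_mario_bros.make(env_id)
    # Restrict the NES controller to a small discrete action set.
    return JoypadSpace(env, SIMPLE_MOVEMENT)


def run_episode(env, policy, max_steps=10_000):
    """Play one episode with `policy` (a function obs -> action).

    Returns (total_reward, steps, last_info).
    """
    reset_out = env.reset()
    # Newer Gym versions return (obs, info) from reset; older ones return obs.
    if (isinstance(reset_out, tuple) and len(reset_out) == 2
            and isinstance(reset_out[1], dict)):
        obs = reset_out[0]
    else:
        obs = reset_out

    total_reward, steps, info = 0.0, 0, {}
    for _ in range(max_steps):
        out = env.step(policy(obs))
        if len(out) == 5:
            # Newer API: separate terminated / truncated flags.
            obs, reward, terminated, truncated, info = out
            done = terminated or truncated
        else:
            # Older API: a single done flag.
            obs, reward, done, info = out
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps, info
```

A random baseline could then be run with `env = make_mario_env()` and `run_episode(env, lambda obs: env.action_space.sample())`; a trained agent would replace the lambda with its own action-selection function.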
Evaluated the agents using metrics such as Avg. Reward, Avg. Q-Value, Avg. Game Score, Avg. Steps Per Episode, and Training and Test Times.
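As an illustration of how such averages might be computed from logged episodes, here is a minimal sketch. The `EpisodeLog` record and the `summarize` function are hypothetical names, and two definitions are assumptions rather than the project's stated method: Avg. Q-Value is taken as the mean of the per-step maximum Q-value, and the game score is whatever final score the simulator reports for the episode.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class EpisodeLog:
    """Per-episode record (hypothetical structure)."""
    total_reward: float
    steps: int
    game_score: float
    # Assumed: one max-over-actions Q-value per step; empty for
    # agents that do not estimate Q-values.
    q_values: list = field(default_factory=list)


def summarize(episodes):
    """Average the per-episode records into the reported metrics."""
    all_q = [q for ep in episodes for q in ep.q_values]
    return {
        "avg_reward": mean([ep.total_reward for ep in episodes]),
        "avg_q_value": mean(all_q) if all_q else None,
        "avg_game_score": mean([ep.game_score for ep in episodes]),
        "avg_steps_per_episode": mean([ep.steps for ep in episodes]),
    }
```

Training and test times are not covered here; they could be measured separately by wrapping the training and evaluation loops with `time.perf_counter()`.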
Trained at least three different agents (in addition to any baseline), which can differ in their state representation (CNN, CNN-RNN, CNN-Transformer) and/or learning algorithm. Each agent was run with three different seeds, and the results were averaged.
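Averaging over seeds can be sketched as below. This is a minimal sketch under assumptions: `average_over_seeds` is a hypothetical helper, each run is assumed to produce a dictionary of metric values keyed by metric name, and reporting the sample standard deviation alongside the mean is an addition for illustration, not something the original write-up states.

```python
from statistics import mean, stdev


def average_over_seeds(results_by_seed):
    """Combine per-seed results into per-metric (mean, std) pairs.

    results_by_seed: {seed: {metric_name: value}}
    returns:         {metric_name: (mean, sample_std)}
    """
    runs = list(results_by_seed.values())
    summary = {}
    for metric in sorted(runs[0]):
        values = [run[metric] for run in runs]
        # Sample standard deviation needs at least two runs.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[metric] = (mean(values), spread)
    return summary
```

With only three seeds the standard deviation is a rough indicator at best, but it still helps to show whether a difference between two agents is larger than the run-to-run variation.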
Conclusion: The results were not as good as expected. One possible reason is that the number of training epochs was too low. In addition, training a single agent took four hours.