Implementation of a 3-layer neural network with SGD (and SGD with momentum) as the optimizer, trained and tested on the MNIST dataset.
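
For readers unfamiliar with the two optimizers, here is a minimal sketch of the update rules in plain Python. It is illustrative only and is not taken from this repository's code: the function names (`sgd_step`, `momentum_step`, `grad_quadratic`), the hyperparameters (`lr=0.1`, `beta=0.9`), and the toy quadratic loss are all assumptions chosen to keep the example self-contained. The momentum variant shown is the classical form, where a velocity term accumulates past gradients.

```python
def sgd_step(params, grads, lr=0.1):
    """Plain SGD: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


def momentum_step(params, grads, velocity, lr=0.1, beta=0.9):
    """Classical momentum: v <- beta * v - lr * g, then p <- p + v."""
    new_velocity = [beta * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def grad_quadratic(params):
    """Gradient of the toy loss f(w) = sum(w_i ** 2) (hypothetical stand-in
    for the gradients a real network would get from backpropagation)."""
    return [2.0 * p for p in params]


# Run both optimizers from the same starting point on the toy loss.
w_sgd = [1.0, -2.0]
w_mom = [1.0, -2.0]
vel = [0.0, 0.0]  # momentum starts with zero velocity
for _ in range(200):
    w_sgd = sgd_step(w_sgd, grad_quadratic(w_sgd))
    w_mom, vel = momentum_step(w_mom, grad_quadratic(w_mom), vel)
```

In a real network the same two functions would be applied to each weight matrix and bias vector, with gradients computed on a mini-batch rather than from a closed-form expression.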