Solutions to the lab assignments of the Reinforcement Learning lab.
See the wiki for documentation and getting-started instructions.
Probability and Statistics:
Markov chains, sampling from distributions.
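As a rough illustration of the Markov chain topic, the sketch below approximates the stationary distribution of a two-state chain in two ways: by repeatedly applying the transition matrix, and by simulating the chain and counting visits. The transition matrix and the function names are made up for this example and are not taken from the lab solutions.

```python
import random

def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution by repeatedly applying pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def simulate_chain(P, start, steps, rng):
    """Simulate the chain and return the fraction of time spent in each state."""
    counts = [0] * len(P)
    state = start
    for _ in range(steps):
        counts[state] += 1
        # Sample the next state from the row of P belonging to the current state.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return [c / steps for c in counts]

# Hypothetical two-state transition matrix (each row sums to 1).
P = [[0.9, 0.1], [0.5, 0.5]]
pi = stationary_distribution(P)
freq = simulate_chain(P, start=0, steps=100_000, rng=random.Random(0))
print("power iteration:", pi)
print("empirical visit frequencies:", freq)
```

For this matrix the exact stationary distribution is (5/6, 1/6), so both estimates should land close to those values.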
Multi-Armed Bandits:
Study of algorithms such as UCB, Thompson Sampling, epsilon-greedy, REINFORCE, and softmax for the multi-armed bandit problem with Bernoulli and Gaussian reward distributions.
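A minimal sketch of two of the listed strategies, epsilon-greedy and UCB1, on a Bernoulli bandit. The arm probabilities, step count, and helper names (run_bandit, epsilon_greedy, ucb1) are assumptions chosen for illustration and may differ from the lab code.

```python
import math
import random

def run_bandit(probs, steps, select, rng):
    """Play a Bernoulli bandit; `select` picks an arm from counts, value estimates and t."""
    k = len(probs)
    counts = [0] * k
    values = [0.0] * k  # running mean reward per arm
    total = 0.0
    for t in range(1, steps + 1):
        arm = select(counts, values, t, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, total

def epsilon_greedy(epsilon):
    """With probability epsilon pick a random arm, otherwise the best arm so far."""
    def select(counts, values, t, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(values))
        return max(range(len(values)), key=values.__getitem__)
    return select

def ucb1(counts, values, t, rng):
    """Pull each arm once, then maximise mean + sqrt(2 ln t / n)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(values)),
               key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))

probs = [0.2, 0.5, 0.8]  # hypothetical success probabilities
eg_counts, eg_total = run_bandit(probs, 5000, epsilon_greedy(0.1), random.Random(0))
ucb_counts, ucb_total = run_bandit(probs, 5000, ucb1, random.Random(0))
print("epsilon-greedy pulls:", eg_counts, "reward:", eg_total)
print("UCB1 pulls:", ucb_counts, "reward:", ucb_total)
```

Both strategies should end up pulling the arm with probability 0.8 most often; the pull counts show how much each one spent on exploration.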
DP Methods for RL:
Policy and value iteration for GridWorld.
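A small value iteration sketch on a GridWorld-style problem. It assumes a 4x4 grid with deterministic moves, a reward of -1 per move, and a single terminal goal cell; the lab's actual GridWorld layout and rewards may differ.

```python
def value_iteration(size=4, goal=(3, 3), gamma=1.0, tol=1e-9):
    """Value iteration on a deterministic size x size grid with reward -1 per move."""
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    states = [(r, c) for r in range(size) for c in range(size)]
    V = {s: 0.0 for s in states}

    def step(s, a):
        # Moves that would leave the grid keep the agent in place.
        r, c = s[0] + a[0], s[1] + a[1]
        return (r, c) if 0 <= r < size and 0 <= c < size else s

    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # terminal state keeps value 0
            # Bellman optimality backup over the four moves.
            best = max(-1.0 + gamma * V[step(s, a)] for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: V[step(s, a)])
              for s in states if s != goal}
    return V, policy

V, policy = value_iteration()
for r in range(4):
    print([V[(r, c)] for c in range(4)])
```

Under these assumptions each cell's value equals minus its Manhattan distance to the goal, which gives a quick sanity check on the result.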
Model-Free RL Algorithms:
Monte Carlo control, SARSA, and Q-learning for MountainCar (continuous environment) and Taxi (discrete environment).
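A minimal tabular Q-learning sketch. To stay self-contained it uses a toy corridor environment invented for this example rather than Taxi or MountainCar, and the hyperparameters are arbitrary; only the update rule is meant to carry over.

```python
import random

def q_learning(n_states=6, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor: start at state 0, reward 1 at the last state."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy behaviour policy, breaking ties at random.
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            # Off-policy target: bootstrap from the best action in the next state.
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q

Q = q_learning()
greedy = [0 if q[0] > q[1] else 1 for q in Q[:-1]]
print("greedy actions per non-terminal state:", greedy)
```

SARSA differs only in the target: it bootstraps from the action actually taken in the next state instead of the maximum over actions.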
Linear Function Approximation and Policy Gradients:
Monte Carlo control, SARSA, and Q-learning with function approximation; DQN and A2C.
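To make the function approximation item concrete, here is a sketch of a single semi-gradient Q-learning step with a linear action-value function. The feature vectors and function names are hypothetical; the lab solutions may use different features (for example tile coding) and a different interface.

```python
def linear_q(weights, features):
    """Q(s, a) as a dot product between a weight vector and state-action features."""
    return sum(w * x for w, x in zip(weights, features))

def semi_gradient_q_update(weights, features, reward, next_features_per_action,
                           alpha, gamma, done):
    """One semi-gradient Q-learning step: w <- w + alpha * td_error * features."""
    next_q = 0.0 if done else max(linear_q(weights, f) for f in next_features_per_action)
    td_error = reward + gamma * next_q - linear_q(weights, features)
    # For a linear model the gradient of Q with respect to w is the feature vector.
    new_weights = [w + alpha * td_error * x for w, x in zip(weights, features)]
    return new_weights, td_error

# Hypothetical features for the taken action and for each action in the next state.
w = [0.0, 0.0]
w, err = semi_gradient_q_update(
    w, [1.0, 0.5], 1.0, [[0.0, 1.0], [1.0, 0.0]],
    alpha=0.1, gamma=0.9, done=False,
)
print(w, err)
```

The update is called semi-gradient because the bootstrapped target is treated as a constant: only the prediction Q(s, a) is differentiated with respect to the weights.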
Literature survey, implementation and evaluation of Proximal Policy Optimization for various tasks.
Other code and assignments from various MOOCs.