This folder contains character-level language modelling on the tinyshakespeare dataset using recurrent neural networks implemented with PyTorch. The dataset and a well-commented Jupyter notebook are included in this folder. The notebook was originally a Google Colab notebook, so if you find it difficult to reproduce the results locally, try it on Google Colaboratory. The original work is available here.
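As a rough illustration of what "character-level" means here, the sketch below builds a character vocabulary and turns text into input/target index sequences. The sample string and the helper names (`encode`, `decode`) are assumptions made for this example; the notebook may prepare its data differently.

```python
# Stand-in text; the notebook would read the tinyshakespeare file instead.
text = "First Citizen: Before we proceed any further, hear me speak."

# Vocabulary: every distinct character, sorted for a stable ordering.
chars = sorted(set(text))
char_to_idx = {ch: i for i, ch in enumerate(chars)}
idx_to_char = {i: ch for ch, i in char_to_idx.items()}


def encode(s):
    """Map a string to a list of integer character indices (hypothetical helper)."""
    return [char_to_idx[ch] for ch in s]


def decode(ids):
    """Map a list of indices back to a string (hypothetical helper)."""
    return "".join(idx_to_char[i] for i in ids)


encoded = encode(text)

# For next-character prediction, the target is the input shifted by one position.
inputs, targets = encoded[:-1], encoded[1:]
```

The index lists would typically be converted to tensors before being fed to the network.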
A number of sample folders are included in this folder, showing the performance of the network at different hyperparameter settings. Each folder contains a generated text file, a loss vs. iterations graph, and the saved trained model, which can be reloaded with PyTorch. This is planned as a programming session for this blog post on my personal blog.
This folder contains the different Bible versions in JSON format. The versions are:
- American Standard-ASV1901 (ASV)
- Bible in Basic English (BBE)
- Darby English Bible (DARBY)
- King James Version (KJV)
- Webster's Bible (WBT)
- Young's Literal Translation (YLT)
The Bible data was originally downloaded from this GitHub repo.
This folder includes the raw gospel text extracted from the Bible database and stored as JSON files.
This folder contains the data for training the LSTMs, generated from the files in the raw_data folder.
This folder contains the Jupyter notebooks that demonstrate data fetching, cleaning, and EDA of the Bible stats. The Bible stats are available in JSON format in the root folder.
The scripts for cleaning and converting raw data to training data are included in this folder. They put the demos from the notebook folder into practice.
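A minimal sketch of what such a cleaning and conversion step could look like is given below. The JSON layout (`book`, `verses`, `text`), the placeholder verse strings, and the function names are all hypothetical; the actual files in raw_data and the scripts in this folder may be organised differently.

```python
import json
import re

# Hypothetical raw record with placeholder text; the real schema may differ.
raw_json = r'''
{"book": "Mark", "verses": [
  {"chapter": 1, "verse": 1, "text": "  First   verse text. "},
  {"chapter": 1, "verse": 2, "text": "Second verse text,\n on two lines."}
]}
'''


def clean_verse(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return re.sub(r"\s+", " ", text).strip()


def to_training_text(record):
    """Join the cleaned verses into one string, one verse per line."""
    return "\n".join(clean_verse(v["text"]) for v in record["verses"])


record = json.loads(raw_json)
training_text = to_training_text(record)
print(training_text)
```

Writing one verse per line is only one possible choice; it keeps verse boundaries visible to a character-level model.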
The folder contains the best trained model, the loss vs. epochs graph, and the training stats.
The folder contains the training results on different validation sets. Using the Gospel of Mark as the validation set results in weaker convergence.
Some samples generated by the trained model using a warmup context.
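The idea of a warmup context can be sketched as follows: the context is first fed through the model only to build up its state, and sampling starts afterwards. To keep the example self-contained, a toy bigram table stands in for the trained LSTM, so the "state" is just the last character seen. The corpus, function names, and parameters are assumptions for illustration and are not taken from the notebook.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat. the cat ate."  # placeholder text

# Stand-in for the trained model: counts of which character follows which.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1


def step(state, ch):
    """Consume one character and return the new state (toy: the character itself)."""
    return ch


def sample_next(state, rng):
    """Sample the next character in proportion to the bigram counts."""
    chars, weights = zip(*sorted(counts[state].items()))
    return rng.choices(chars, weights=weights, k=1)[0]


def generate(warmup, length, seed=0):
    if not warmup or warmup[-1] not in counts:
        raise ValueError("warmup must end with a character seen in the corpus")
    rng = random.Random(seed)
    state = None
    # Warmup phase: feed the context without sampling anything.
    for ch in warmup:
        state = step(state, ch)
    # Generation phase: sample, then feed each sample back in.
    out = []
    for _ in range(length):
        ch = sample_next(state, rng)
        out.append(ch)
        state = step(state, ch)
    return warmup + "".join(out)


sample = generate("the c", 20)
```

With an LSTM, `step` would instead run one forward pass and carry the hidden and cell states, but the two-phase structure stays the same.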
Medium Post: https://medium.com/@sleebapaul/gospel-of-lstms-how-i-wrote-5th-gospel-of-bible-using-lstms-4cffa70e5f1a
Google Colab Link: https://colab.research.google.com/drive/1euakjbNiZgCfbmCWzT6pIZB2MYZbHjk-
The trained model can be reproduced in the Colab notebook. Although I've added a copy of the Colab notebook as pytorch_LSTM.ipynb for local reference, I recommend reproducing the results on Google Colaboratory.
I've added the Gospel of LSTMs epub version to the repo. Please use it for a better reading experience.
Happy Learning!