In this project, we developed a basic Encoder-Decoder model and an S2VT model to generate video captions. In addition, we applied attention mechanisms to improve performance.
The following instructions will get a copy of the project up and running on your local machine for testing purposes.
The following are the toolkits, and the versions of them, you need to install to run this project:
- Python 3.6 - The Python version used
- PyTorch 0.3 - Deep learning framework for Python
- Pandas 0.21.0 - Data analysis library for Python
In addition, a GPU is required to run this project.
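Assuming a pip-based environment, the dependencies listed above can be installed roughly as follows (the exact command for an old PyTorch release may differ on your machine, since PyTorch 0.3 wheels are no longer distributed through the usual channels):

```shell
# Illustrative setup, assuming pip is available; the torch 0.3.x
# package source may need to be adjusted for your CUDA version.
pip install pandas==0.21.0
pip install torch==0.3.1
```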
The following are the model structures we implemented in PyTorch from scratch:
- [Baseline Model]
- [S2VT Model] - To improve performance, we also implemented Bahdanau attention [1] and Luong attention [2]
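As a rough illustration of the attention idea used here (not the project's exact implementation), a minimal Luong-style dot-product attention module scores the current decoder state against every encoder time step and returns a weighted context vector; all tensor names and shapes below are assumptions for the sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LuongDotAttention(nn.Module):
    """Minimal Luong-style (dot-product) attention sketch.

    Scores the decoder hidden state against every encoder output,
    normalizes the scores with softmax, and returns the weighted
    sum of encoder outputs as the context vector.
    """

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden:  (batch, hidden)
        # encoder_outputs: (batch, seq_len, hidden)
        scores = torch.bmm(encoder_outputs,
                           decoder_hidden.unsqueeze(2))   # (batch, seq_len, 1)
        weights = F.softmax(scores.squeeze(2), dim=1)     # (batch, seq_len)
        context = torch.bmm(weights.unsqueeze(1),
                            encoder_outputs)              # (batch, 1, hidden)
        return context.squeeze(1), weights                # (batch, hidden), (batch, seq_len)
```

Bahdanau attention differs mainly in the scoring function: instead of a raw dot product, it feeds the decoder state and each encoder output through a small feed-forward layer before scoring.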
[1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473
[2] Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. arXiv:1508.04025
[3] Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. 2015. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks. arXiv:1506.03099
[4] Natsuda Laokulrat, Sang Phan, and Noriki Nishida. 2016. Generating Video Description Using Sequence-to-sequence Model with Temporal Attention