
multi layer RNN #79

Open · wants to merge 2 commits into master
Conversation

WoodySG2018

Coded a multi-layer RNN. It uses 2 layers by default; pass "num_layers" when calling DecoderWithAttention to adjust the number of layers. Please check.
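The PR's actual DecoderWithAttention changes aren't shown in this thread; below is only a minimal sketch of the general technique, stacking `nn.LSTMCell` modules behind a `num_layers` parameter. Class and variable names are illustrative, not the PR's code, and the attention/embedding inputs are collapsed into a single input tensor.

```python
import torch
import torch.nn as nn

class MultiLayerDecoderCell(nn.Module):
    """Illustrative stack of LSTMCells for one decoding time step.

    Layer 0 consumes the decoder input (e.g. word embedding concatenated
    with the attention-weighted encoding); each deeper layer consumes the
    hidden state of the layer below it.
    """

    def __init__(self, input_dim, hidden_dim, num_layers=2):
        super().__init__()
        self.num_layers = num_layers
        self.cells = nn.ModuleList(
            [nn.LSTMCell(input_dim if i == 0 else hidden_dim, hidden_dim)
             for i in range(num_layers)]
        )

    def forward(self, x, h_list, c_list):
        new_h, new_c = [], []
        inp = x
        for i, cell in enumerate(self.cells):
            h, c = cell(inp, (h_list[i], c_list[i]))
            new_h.append(h)
            new_c.append(c)
            inp = h  # this layer's output feeds the next layer
        return new_h, new_c

# One decoding step with hypothetical dimensions (embedding 512 + encoder 2048).
batch, input_dim, hidden_dim, num_layers = 4, 512 + 2048, 512, 2
dec = MultiLayerDecoderCell(input_dim, hidden_dim, num_layers=num_layers)
x = torch.randn(batch, input_dim)
h = [torch.zeros(batch, hidden_dim) for _ in range(num_layers)]
c = [torch.zeros(batch, hidden_dim) for _ in range(num_layers)]
h, c = dec(x, h, c)
# h[-1] (top-layer hidden state) would drive attention and the vocab scores.
```

In a real decoder this step would run inside the time-step loop, with the top-layer hidden state used both to compute the next attention weights and to score the vocabulary.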

WoodySG2018 and others added 2 commits July 3, 2019 17:23
@kmario23
Contributor

Does it achieve improved results when compared to a single layer one? If yes, by what margin?

@WoodySG2018
Author

> Does it achieve improved results when compared to a single layer one? If yes, by what margin?

Hi kmario23, training is ongoing. So far I see a clear difference in how the loss decreases between the 2-layer and 4-layer RNN (I can't trace back single-layer RNN performance), but it's too early to tell how performance relates to the number of layers on my working dataset.

Theoretically, capturing highly hierarchical structure with just one layer is not optimal. So hopefully, by integrating multiple layers here, we enable users to test the model's power along another dimension on their own datasets.

@nilinykh

nilinykh commented Jan 6, 2020

Is there any update on this issue? Will it be merged? A very nice functionality, I would say.

Just a quick question related to how the initial states for the multi-layer LSTM are constructed: why do we need to initialize the hidden/cell states of the LSTM by passing the image representation through an FC layer? Is that the standard way of doing it?

@kmario23
Contributor

kmario23 commented Jan 7, 2020

> Just a quick question related to how initial states for multi-layer LSTM are constructed: I am wondering why we need to initialize hidden/cell states of the LSTM by passing image representation through FC layer? Is it somehow the standard way of doing it?

That's how we pass the learned representation of the image from the CNN (encoder) to the RNN/LSTM (decoder). The decoder dimension usually varies in size (it's a hyperparameter), so to match the encoder dimension to the decoder dimension we have to project the encoder output, which has dimension 2048 in this case, down to the decoder dimension. Otherwise it's not possible to feed the encoder output directly into the decoder.
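The projection step described above can be sketched as follows. This is only an illustration under stated assumptions: the CNN features are mean-pooled over spatial locations before projection, the encoder dimension is 2048 as mentioned, and the layer names (`init_h`, `init_c`) are hypothetical, not necessarily the repository's.

```python
import torch
import torch.nn as nn

encoder_dim, decoder_dim, num_layers = 2048, 512, 2

# Hypothetical FC layers mapping the pooled CNN features (encoder_dim)
# to the LSTM's state size (decoder_dim), one each for h and c.
init_h = nn.Linear(encoder_dim, decoder_dim)
init_c = nn.Linear(encoder_dim, decoder_dim)

encoder_out = torch.randn(4, 196, encoder_dim)  # (batch, num_pixels, encoder_dim)
mean_feat = encoder_out.mean(dim=1)             # pool over spatial locations

# For a multi-layer LSTM, one simple choice is to reuse the same projected
# state for every layer (alternatively, learn a separate Linear per layer).
h0 = [init_h(mean_feat) for _ in range(num_layers)]
c0 = [init_c(mean_feat) for _ in range(num_layers)]
# h0[0].shape -> torch.Size([4, 512])
```

Without this projection, the 2048-dim encoder output could not serve as the initial state of a 512-dim decoder; the FC layer also lets the model learn how to summarize the image into a useful starting state.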
