You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi @JonathanRaiman , first off thanks for setting up this repo. Learned a lot from your implementation.
I have a question regarding the masking operation for variable length inputs. You mentioned in docs that
Elementwise mutliply this mask with your output, and then apply your objective function to this masked output. The error will be obtained everywhere, but will be zero in areas that were masked, yielding the correct error function.
But by doing so you only cuts off the back-propagation (BP) paths from masked outputs, whereas the BP paths from unmasked outputs via masked hidden units remain. The resulting loss function is still errornous.
I notice from other LSTM implementations (e.g. Lasagne) that the usual approach is to use the mask to switch between previous hidden states (if current input is masked) and computed hidden states (otherwise). In this way, both type of unwanted BP paths (masked outputs -> masked hidden -> weight & unmasked outputs -> masked hidden -> weight.)
Plz correct me if I'm wrong. Am I missing something?
The text was updated successfully, but these errors were encountered:
Hi @JonathanRaiman , first off thanks for setting up this repo. Learned a lot from your implementation.
I have a question regarding the masking operation for variable length inputs. You mentioned in docs that
But by doing so you only cuts off the back-propagation (BP) paths from masked outputs, whereas the BP paths from unmasked outputs via masked hidden units remain. The resulting loss function is still errornous.
I notice from other LSTM implementations (e.g. Lasagne) that the usual approach is to use the mask to switch between previous hidden states (if current input is masked) and computed hidden states (otherwise). In this way, both type of unwanted BP paths (masked outputs -> masked hidden -> weight & unmasked outputs -> masked hidden -> weight.)
Plz correct me if I'm wrong. Am I missing something?
The text was updated successfully, but these errors were encountered: