
Updates for pytorch 0.4 #2

Open
esvhd opened this issue Jul 23, 2018 · 2 comments

esvhd commented Jul 23, 2018

Hi there,

Has anyone tried to update this code for pytorch 0.4? The awd-lstm-lm repo was recently upgraded to make its models work with 0.4.

I made some attempts but got stuck in gradstat(). Because this method calls model.eval(), I ran into the following error when running with a trained model. See code here

Traceback (most recent call last):
  File "dynamiceval.py", line 299, in <module>
    gradstat(args, corpus, model, train_data, criterion)
  File "dynamiceval.py", line 83, in gradstat
    loss.backward()
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 89, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: backward_input can only be called in training mode

It looks like I cannot call loss.backward() while the model is in eval mode. I haven't tried running it with earlier versions of pytorch. Is there a workaround for this in version 0.4?

Thanks.

@benkrause
Owner

Hi,

Thanks for bringing this issue to my attention, and nice job getting that far. I looked into this myself and ran into the same problem: in pytorch 0.4 it is not possible to call loss.backward() on an RNN in eval mode, and since we don't want dropout active, simply using train mode doesn't make sense either. One hacky workaround that should work is to set all of the model's dropout parameters to 0 and use model.train() instead of model.eval(). I'm hoping to find a better solution, though. I raised an issue about this in pytorch: pytorch/pytorch#10006 . I'll let you know when I have a 0.4-compatible version.

Thanks!
Ben
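For anyone else hitting this, the workaround Ben describes can be sketched roughly as follows. This is a minimal illustration using a plain nn.LSTM as a stand-in for the repo's actual model, so the names here are illustrative, not the project's API. The idea is to zero every dropout probability (both the RNN's internal dropout and any explicit nn.Dropout modules a model like AWD-LSTM might contain), then stay in train mode so the RNN backward pass is permitted:

```python
import torch
import torch.nn as nn

# Stand-in for the trained model (illustrative only).
model = nn.LSTM(input_size=8, hidden_size=8, num_layers=2, dropout=0.5)

# Zero the RNN's inter-layer dropout probability.
model.dropout = 0.0

# Also zero any explicit Dropout modules the model may contain
# (a no-op for a bare nn.LSTM, but relevant for AWD-LSTM-style models).
for m in model.modules():
    if isinstance(m, nn.Dropout):
        m.p = 0.0

# Train mode, but deterministic because all dropout is disabled;
# this avoids the "backward_input can only be called in training
# mode" error that eval() triggers on the cuDNN RNN backward.
model.train()

x = torch.randn(5, 3, 8)  # (seq_len, batch, input_size)
out, _ = model(x)
loss = out.sum()
loss.backward()  # succeeds in train mode
```

Since dropout is off, two forward passes over the same input should produce identical outputs, so gradstat() still sees the deterministic behavior it would get from eval mode.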

@esvhd
Author

esvhd commented Jul 30, 2018

Hi Ben,

Thank you for getting back to me on this. Very keen to get this working with the latest pytorch. Looking forward to hearing your solution!

Best,
esvhd
