Unstable learning #70
I incorporated training and testing into the same pipeline in the latest commit. I also added an orthogonal weight initialization, which helps make training more stable. You can set --eval_scheme=5-fold-cv-standalone-test, which will perform a train/valid/test split like this:
You can also simply run a 5-fold CV with --eval_scheme=5-fold-cv. There were some issues with the testing script when loading pretrained weights (i.e., sometimes the weights are not fully loaded or some weights are missing; setting strict=False can reveal the problems). The purpose of the testing script is to generate the heatmap; you should now read the performance directly from the training script. I will fix the issues in a couple of days.
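Two points from the comment above, orthogonal initialization and checking a checkpoint with strict=False, can be sketched in PyTorch. This is a minimal sketch under stated assumptions: `TinyAttentionMIL` and `init_orthogonal` are hypothetical names invented for illustration and are not the repository's model or code. Only `nn.init.orthogonal_` and `load_state_dict(..., strict=False)` are actual PyTorch calls.

```python
import torch
import torch.nn as nn


class TinyAttentionMIL(nn.Module):
    """Hypothetical stand-in model; NOT the repository's architecture."""

    def __init__(self, in_dim=8, hidden=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.attention = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, bag):
        h = torch.tanh(self.embed(bag))              # (n_instances, hidden)
        a = torch.softmax(self.attention(h), dim=0)  # attention over instances
        return self.classifier((a * h).sum(dim=0))   # bag-level logit


def init_orthogonal(module):
    # Orthogonal weights and zero biases for every linear layer.
    if isinstance(module, nn.Linear):
        nn.init.orthogonal_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


torch.manual_seed(0)
model = TinyAttentionMIL()
model.apply(init_orthogonal)  # apply() visits every submodule

# Simulate an incomplete checkpoint by dropping the attention parameters.
checkpoint = {
    name: tensor
    for name, tensor in model.state_dict().items()
    if not name.startswith("attention.")
}

# strict=False does not raise on mismatches; it returns the key lists instead.
fresh = TinyAttentionMIL()
result = fresh.load_state_dict(checkpoint, strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)
```

Any name listed under the missing keys keeps its fresh random initialization after loading, which would silently change test-time results.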
Hi
Thank you for your great work.
I have a problem training your model.
These are the curves of the score and the test (validation) loss for several hyperparameter settings on TCGA data:
21.pth -> lr = 2e-4, wd = 5e-3
1.pth -> lr = 2e-4, wd = 5e-4
2.pth -> lr = 2e-4, wd = 5e-4 (1 and 2 have the same hyperparameters, only different random initializations)
5.pth -> lr = 2e-4, wd = 1e-4
18.pth -> lr = 2e-4, wd = 1e-7
19.pth -> lr = 2e-5, wd = 5e-7
Based on what I see, training seems unstable: the AUC scores get worse over time (or at least behave erratically), while the test loss, which is the validation loss here, keeps decreasing, which suggests that overfitting is not the cause. Can you please explain the reason and what is happening here?
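The pattern described above, a falling validation loss alongside a falling AUC, is not contradictory in itself: AUC depends only on how samples are ranked, while cross-entropy also rewards confident, well-calibrated probabilities. The sketch below illustrates this with made-up numbers that are not taken from the runs in this issue. Whether this explains the curves reported here is not confirmed.

```python
import math


def binary_cross_entropy(labels, probs):
    # Mean negative log-likelihood of the true labels.
    return -sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probs)
    ) / len(labels)


def auc(labels, probs):
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative values only (not data from this issue).
labels = [0, 0, 1, 1]
early = [0.49, 0.49, 0.51, 0.51]  # perfect ranking, barely confident
late = [0.05, 0.60, 0.55, 0.95]   # one ranking error, mostly confident

for name, probs in (("early", early), ("late", late)):
    print(name, "loss=%.3f" % binary_cross_entropy(labels, probs),
          "auc=%.2f" % auc(labels, probs))
```

In this toy case the later predictions have the lower loss and the lower AUC at the same time.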
Also, why did you set 200 epochs for training, when across 21 different hyperparameter settings I have not seen a single model get updated after epoch 7?
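If the best checkpoint never changes after the first few epochs, a patience-based early stop is one common way to avoid running the full epoch budget. The sketch below assumes the checkpoint is saved whenever the validation score improves; that is an assumption about the training script, not something confirmed here. The function name `best_epoch_with_patience` and the scores are hypothetical and purely illustrative.

```python
def best_epoch_with_patience(val_scores, patience):
    """Return (best_epoch, last_epoch_run) for a list of validation scores.

    Training stops once `patience` epochs have passed without a new best.
    """
    best_score = float("-inf")
    best_epoch = -1
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            # New best: this is where a checkpoint would be written.
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch  # stopped early
    return best_epoch, len(val_scores) - 1  # ran the full budget


# Illustrative validation AUCs only (not data from this issue).
scores = [0.70, 0.78, 0.81, 0.80, 0.79, 0.80, 0.78, 0.77]
best, last = best_epoch_with_patience(scores, patience=3)
print("best epoch:", best, "stopped at epoch:", last)
```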
Another very strange thing is that 19.pth has the best AUC, yet its attention weights are very bad (all of them are 0), which is very odd.
I think this is still a continuation of #61 (comment).
Thank you