Gradient disaster problem encountered during training #12
Comments
Please provide more information about the NaN issue, such as a minimal code sample that reproduces the problem.
I tried to reproduce this problem with simple code, but was unsuccessful. I used this model: https://github.com/QuantumLab-ZY/HamGNN and replaced the fully connected layer with a KAN. efficient_kan works normally, but fastkan keeps producing NaN.
I have the very same issue.
I built a KAN using the FastKAN linear layers rather than an MLP with Linear layers, and the loss comes out as NaN.
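A minimal reproduction of this setup might look like the sketch below. The `fastkan` import path and the `FastKAN([...])` constructor taking a list of layer widths are assumptions (mirroring efficient_kan's `KAN`); adjust them to the actual package.

```python
# Minimal sketch: train the same regression task with an MLP and with a
# FastKAN model, and stop as soon as the loss becomes non-finite.
import torch
import torch.nn as nn
from fastkan import FastKAN  # assumed import path

torch.manual_seed(0)

# Synthetic regression data
x = torch.randn(1024, 16)
y = torch.sin(x).sum(dim=-1, keepdim=True)

def train(model, steps=200, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for step in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        if not torch.isfinite(loss):
            print(f"{model.__class__.__name__}: non-finite loss at step {step}")
            return
        loss.backward()
        opt.step()
    print(f"{model.__class__.__name__}: final loss {loss.item():.4f}")

mlp = nn.Sequential(nn.Linear(16, 64), nn.SiLU(), nn.Linear(64, 1))
kan = FastKAN([16, 64, 1])  # assumed constructor: list of layer widths

train(mlp)  # baseline MLP
train(kan)  # the configuration reported to produce NaN in this issue
```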
@EladWarshawsky @gsynb Please provide more details, such as your implementation and training curves.
Judging from the usage examples, fastkan and efficient_kan should be used in the same way. I tried replacing the MLP in my model with a KAN: efficient_kan trains normally, but fastkan always causes the training loss to become NaN. I haven't found the cause yet, so I'd like to ask whether you have any solutions or insights.
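A generic PyTorch debugging sketch (not part of either repository's API) that may help narrow this down: register forward hooks that report which submodule first produces a non-finite output. If fastkan's basis functions assume inputs within a bounded range, checking the scale of the activations entering the KAN layers would also be worthwhile.

```python
# Generic sketch: flag the first module whose forward output contains NaN/Inf.
import torch
import torch.nn as nn

def attach_nan_hooks(model: nn.Module):
    """Register forward hooks that print the name of any submodule whose
    output contains NaN or Inf. Returns the handles so they can be removed."""
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            outputs = output if isinstance(output, (tuple, list)) else (output,)
            for t in outputs:
                if torch.is_tensor(t) and not torch.isfinite(t).all():
                    print(f"non-finite output in {name} ({module.__class__.__name__})")
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            handles.append(module.register_forward_hook(make_hook(name)))
    return handles

# Usage sketch: attach the hooks to the KAN model, run one real training batch,
# and note which submodule is reported first; that is where the NaN originates.
# handles = attach_nan_hooks(kan_model)
# loss = loss_fn(kan_model(batch_x), batch_y)
```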