
Gradient disaster problem encountered during training #12

Open

gsynb opened this issue Jun 26, 2024 · 5 comments

gsynb commented Jun 26, 2024

Judging from the code usage examples, fastkan and efficient_kan should be used in the same way. I tried replacing the MLP in my model with a KAN: training with efficient_kan is normal, but with fastkan the training loss always becomes NaN. I haven't found the cause yet, so I'd like to ask whether you have any solutions or insights.
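
For context, the kind of drop-in substitution described here might look roughly like the sketch below. The import paths and list-of-widths constructors are assumptions based on the two repos' READMEs, and the widths are placeholders, not the ones used in the actual model.

```python
import torch

# Both libraries build a KAN from a list of layer widths, so in principle
# they are drop-in replacements for the same MLP.
# Import paths/constructors are assumed from the two repos' READMEs;
# the widths below are placeholders.
from efficient_kan import KAN as EfficientKAN
from fastkan import FastKAN

widths = [64, 128, 16]
efficient_model = EfficientKAN(widths)
fast_model = FastKAN(widths)

x = torch.randn(8, widths[0])
print(efficient_model(x).shape)  # torch.Size([8, 16])
print(fast_model(x).shape)       # torch.Size([8, 16])
```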

ZiyaoLi (Owner) commented Jun 26, 2024

Please provide more information about the NaN issue, such as a minimal code sample that reproduces the problem.
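
A minimal reproduction of the kind being requested might look roughly like this. It is only a sketch: the layer widths, learning rate, and number of steps are arbitrary, and the `FastKAN([widths])` constructor is assumed from the fast-kan README.

```python
import torch
from fastkan import FastKAN  # import path assumed from the fast-kan README

torch.manual_seed(0)
model = FastKAN([32, 64, 1])  # arbitrary widths
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random regression data, just to exercise forward/backward passes.
x = torch.randn(256, 32)
y = torch.randn(256, 1)

for step in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    if torch.isnan(loss):
        print(f"loss became NaN at step {step}")
        break
```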

gsynb (Author) commented Jun 26, 2024

I tried to reproduce this problem with simple code, but was unsuccessful. I used this model https://github.com/QuantumLab-ZY/HamGNN and replaced its fully connected layers with KAN layers; efficient_kan runs normally, but fastkan keeps producing NaN.

EladWarshawsky commented

I have the same issue.

EladWarshawsky commented

I build a KAN from FastKAN layers rather than an MLP with linear layers, and the loss comes out as NaN.

ZiyaoLi (Owner) commented Aug 7, 2024

@EladWarshawsky @gsynb Please provide more detail, such as your implementation and training curves.
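
One lightweight way to capture that kind of trace (a hedged sketch, not specific to either KAN implementation) is to enable PyTorch's anomaly detection and log the loss together with the global gradient norm each step, which usually shows whether the NaN is preceded by exploding gradients:

```python
import torch

# Pinpoints the first operation that produces NaN/Inf in the backward pass
# (slow, so only enable it for a short debugging run).
torch.autograd.set_detect_anomaly(True)

def log_step(model, loss, step):
    """Call right after loss.backward(): logs the loss and the global gradient norm."""
    # With max_norm=inf this only computes and returns the total norm; nothing is clipped.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=float("inf"))
    print(f"step {step:6d}  loss={loss.item():.4e}  grad_norm={grad_norm:.4e}")
    if not torch.isfinite(loss) or not torch.isfinite(grad_norm):
        raise RuntimeError(f"non-finite loss or gradient at step {step}")
```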
