
Gradient disaster problem encountered during training #12

Open

gsynb opened this issue Jun 26, 2024 · 5 comments

gsynb commented Jun 26, 2024

Judging from the code usage examples, fastkan and efficient_kan should be used in the same way. I tried replacing the MLP in my model with a KAN: training with efficient_kan is normal, but with fastkan the training loss always becomes NaN. I haven't found the cause yet, so I'd like to ask whether you have any solutions or insights.
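
For context, the kind of drop-in substitution described here might look roughly like the sketch below. The import paths and list-of-widths constructors are assumptions based on the two repos' READMEs, and the widths are placeholders, not the ones used in the actual model.

```python
import torch

# Both libraries build a KAN from a list of layer widths, so in principle
# they are drop-in replacements for the same MLP.
# Import paths/constructors are assumed from the two repos' READMEs;
# the widths below are placeholders.
from efficient_kan import KAN as EfficientKAN
from fastkan import FastKAN

widths = [64, 128, 16]
efficient_model = EfficientKAN(widths)
fast_model = FastKAN(widths)

x = torch.randn(8, widths[0])
print(efficient_model(x).shape)  # torch.Size([8, 16])
print(fast_model(x).shape)       # torch.Size([8, 16])
```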

ZiyaoLi (Owner) commented Jun 26, 2024

Please provide more information about the NaN issue, such as a minimal code sample that reproduces the problem.
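
A minimal reproduction of the kind being requested might look roughly like this. It is only a sketch: the layer widths, learning rate, and number of steps are arbitrary, and the `FastKAN([widths])` constructor is assumed from the fast-kan README.

```python
import torch
from fastkan import FastKAN  # import path assumed from the fast-kan README

torch.manual_seed(0)
model = FastKAN([32, 64, 1])  # arbitrary widths
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random regression data, just to exercise forward/backward passes.
x = torch.randn(256, 32)
y = torch.randn(256, 1)

for step in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    if torch.isnan(loss):
        print(f"loss became NaN at step {step}")
        break
```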

gsynb (Author) commented Jun 26, 2024

I tried to reproduce this problem with simple code, but was unsuccessful. I used this model https://github.com/QuantumLab-ZY/HamGNN and replaced its fully connected layers with KAN layers; efficient_kan runs normally, but fastkan keeps producing NaN.

EladWarshawsky commented

I have the same issue.

EladWarshawsky commented

I build a KAN from FastKAN layers rather than an MLP with linear layers, and the loss comes out as NaN.

ZiyaoLi (Owner) commented Aug 7, 2024

@EladWarshawsky @gsynb Please provide more detail, such as your implementation and training curves.
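
One lightweight way to capture that kind of trace (a hedged sketch, not specific to either KAN implementation) is to enable PyTorch's anomaly detection and log the loss together with the global gradient norm each step, which usually shows whether the NaN is preceded by exploding gradients:

```python
import torch

# Pinpoints the first operation that produces NaN/Inf in the backward pass
# (slow, so only enable it for a short debugging run).
torch.autograd.set_detect_anomaly(True)

def log_step(model, loss, step):
    """Call right after loss.backward(): logs the loss and the global gradient norm."""
    # With max_norm=inf this only computes and returns the total norm; nothing is clipped.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=float("inf"))
    print(f"step {step:6d}  loss={loss.item():.4e}  grad_norm={grad_norm:.4e}")
    if not torch.isfinite(loss) or not torch.isfinite(grad_norm):
        raise RuntimeError(f"non-finite loss or gradient at step {step}")
```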
