Reproduce the GAT v1 attention matrix #5
Thanks for your great contribution!
I'm confused about Figure 1(a) in your paper. Which layer of GAT does this attention matrix come from? Is the attention matrix the same across all layers? Do the different heads within one layer show the same pattern?
Best regards

Hi @ALEX13679173326! This is one of the heads of a single layer of GAT/GATv2, trained on the DictionaryLookup problem (Figure 2). Does that answer your questions? Feel free to let us know if anything is unclear.

Thanks very much for your reply! Recently, I found the same pattern in the attention matrices of ViT (Vision Transformer), which also uses the self-attention mechanism. If we regard ViT as a graph model, this phenomenon may be connected to GAT. In my view, the pattern in Figure 1(a) may point to a potential weakness of the self-attention mechanism. Have you investigated the cause of this phenomenon? Thanks again!

Our main analysis is on the GAT formulation.
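For anyone trying to reproduce this, here is a minimal sketch of how one head's attention matrix can be extracted from a single GAT layer. It assumes PyTorch Geometric's `GATConv`; the toy bipartite graph and the untrained weights below are placeholders for illustration, not the paper's actual DictionaryLookup benchmark or training code.

```python
# Sketch: extract and inspect one attention head of a single GAT layer.
# Assumes PyTorch Geometric; the graph here is a toy stand-in for the
# DictionaryLookup-style setup, with untrained weights.
import torch
from torch_geometric.nn import GATConv

num_keys = 5
# Nodes 0..4 act as "query" nodes, nodes 5..9 as "key" nodes;
# every query node receives an edge from every key node.
src = torch.arange(num_keys).repeat_interleave(num_keys) + num_keys  # key nodes
dst = torch.arange(num_keys).repeat(num_keys)                        # query nodes
edge_index = torch.stack([src, dst], dim=0)

x = torch.randn(2 * num_keys, 16)   # random node features
conv = GATConv(16, 8, heads=4)      # one GAT layer with 4 heads

# return_attention_weights=True makes GATConv also return (edge_index, alpha),
# where alpha has shape [num_edges (incl. self-loops), num_heads].
out, (att_edge_index, alpha) = conv(x, edge_index, return_attention_weights=True)

# Assemble the dense attention matrix for head 0 (rows: target/query nodes,
# columns: source/key nodes) -- the per-head matrix Figure 1(a) visualizes.
head = 0
A = torch.zeros(2 * num_keys, 2 * num_keys)
A[att_edge_index[1], att_edge_index[0]] = alpha[:, head].detach()
print(A[:num_keys, num_keys:])      # the queries x keys block
```

After training on a lookup-style task, plotting this `queries x keys` block (e.g. with `matplotlib.pyplot.imshow`) for each head should give matrices comparable in form to the one in the figure, one per head per layer.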