Hello, I'm a newcomer to diffusion-based generation. I'd like to ask why, in SpatialLinearAttention, a context is first computed from 'k' and 'v'. This seems different from the typical self-attention mechanism, where attention coefficients are computed from 'q' and 'k'. Is there a specific reason for this approach, or do other papers describe it? I'd appreciate an explanation. Thank you!
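For context: the k-and-v-first ordering the question describes is the defining trait of linear attention (see e.g. Shen et al., "Efficient Attention: Attention with Linear Complexities", and Katharopoulos et al., "Transformers are RNNs"). By aggregating keys with values first, the quadratic attention map is never materialized. Below is a minimal sketch contrasting the two orderings; the shapes and variable names are illustrative assumptions, not the repo's exact code.

```python
# Sketch only: tensors assumed to be (batch, heads, head_dim, seq_len),
# mirroring the einsum layout common in linear-attention implementations.
import torch

b, h, d, n = 2, 4, 32, 256
q, k, v = (torch.randn(b, h, d, n) for _ in range(3))
scale = d ** -0.5

# Standard softmax attention: pair q with k first.
# The (n x n) attention map makes time and memory O(n^2) in sequence length.
attn = torch.einsum('b h d i, b h d j -> b h i j', q * scale, k).softmax(dim=-1)
out_standard = torch.einsum('b h i j, b h e j -> b h e i', attn, v)

# Linear attention: aggregate k with v first.
# Separate softmaxes (over features for q, over positions for k) stand in
# for the joint softmax; 'context' is only (d x d), so the cost is O(n).
q_lin = q.softmax(dim=-2) * scale
k_lin = k.softmax(dim=-1)
context = torch.einsum('b h d n, b h e n -> b h d e', k_lin, v)
out_linear = torch.einsum('b h d e, b h d n -> b h e n', context, q_lin)

print(out_standard.shape, out_linear.shape)  # both (b, h, d, n)
```

The two orderings are not numerically equivalent (the joint softmax is replaced by two independent ones), but the second gives an O(n) approximation of attention, which is why it is used on the large spatial token counts in video/image models.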