xformers can't work #134

Open
ghosthamlet opened this issue Jul 9, 2024 · 0 comments

Running in fp16 with --enable_xformers_memory_efficient_attention fails with this error:

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(32, 4096, 1, 72) (torch.float16)
     key         : shape=(32, 300, 1, 72) (torch.float16)
     value       : shape=(32, 300, 1, 72) (torch.float16)
     attn_bias   : <class 'torch.Tensor'>
     p           : 0.0
`[email protected]` is not supported because:
    attn_bias type is <class 'torch.Tensor'>
`cutlassF` is not supported because:
    attn_bias.stride(-2) % 8 != 0 (attn_bias.stride() = (300, 0, 0, 1))
    HINT: To use an `attn_bias` with a sequence length that is not a multiple of 8, you need to ensure memory is aligned by slicing a bigger tensor. Example: use `attn_bias = torch.zeros([1, 1, 5, 8])[:,:,:,:5]` instead of `torch.zeros([1, 1, 5, 5])`
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    dtype=torch.float16 (supported: {torch.float32})
    has custom scale
    unsupported embed per head: 72

Does xformers have to be used with fp32?
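
For reference, the cutlassF reason above is about the memory layout of attn_bias rather than the dtype: its stride(-2) must be a multiple of 8. Following the HINT in the error, one workaround is to allocate the bias with its last dimension padded up to a multiple of 8 and then slice it back to the real key length. A minimal sketch, assuming the shapes from this traceback (the names B, H, M, N and the zero-initialized bias are illustrative, not taken from this project's code):

    import torch

    B, H, M, N = 32, 1, 4096, 300      # batch, heads, query length, key length from the error
    pad_N = (N + 7) // 8 * 8           # round the key length (300) up to a multiple of 8 -> 304

    # Allocate a padded bias and keep a sliced view of it: the view has shape (B, H, M, N)
    # but stride(-2) == pad_N, which satisfies the cutlassF alignment check.
    bias_padded = torch.zeros(B, H, M, pad_N, dtype=torch.float16, device="cuda")
    attn_bias = bias_padded[:, :, :, :N]

Any real bias values would then be written into attn_bias (the sliced view) before it is passed to memory_efficient_attention.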
