
Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? #1442

Closed
carstenbauer opened this issue Mar 15, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@carstenbauer
Member

Worked after #1119 was merged until (and including) 32023f3. Stopped working with 86b5069. Fixed on master by the last commit, i.e. 1bab170.

The unfortunate thing is that there is still no working CUDA.jl release with Int8 WMMA support, since 3.8.4 and 3.8.5, besides the new feature (#1119), also contain the culprit 86b5069. I wonder why the unit tests didn't detect this. Maybe we need more tests?

For some context, I want to use this feature over at GPUInspector.jl in peakflops(; dtype=Int8, tensorcores=true).

MWE:

using CUDA

# Loads 16x16 Int8 tiles of A and B and a 16x16 Int32 tile of C (all
# column-major, stride 16), performs D = A * B + C on the tensor cores,
# and stores the Int32 result.
function kernel_wmma_int8_lowlevel(a_dev, b_dev, c_dev, d_dev)
    a_frag = WMMA.llvm_wmma_load_a_col_m16n16k16_global_stride_s8(pointer(a_dev), 16)
    b_frag = WMMA.llvm_wmma_load_b_col_m16n16k16_global_stride_s8(pointer(b_dev), 16)
    c_frag = WMMA.llvm_wmma_load_c_col_m16n16k16_global_stride_s32(pointer(c_dev), 16)

    d_frag = WMMA.llvm_wmma_mma_col_col_m16n16k16_s8(a_frag, b_frag, c_frag)

    WMMA.llvm_wmma_store_d_col_m16n16k16_global_stride_s32(pointer(d_dev), d_frag, 16)
    return nothing
end

function launch_kernel_wmma_int8()
    m = n = k = 16
    dtype_a = dtype_b = Int8    # multiplicand element type
    dtype_c = dtype_d = Int32   # accumulator/result element type
    d_a = CUDA.rand(dtype_a, m, k)
    d_b = CUDA.rand(dtype_b, k, n)
    d_c = CUDA.rand(dtype_c, m, n)
    d_d = CUDA.zeros(dtype_d, m, n)
    # WMMA is a warp-level operation, so launch a full warp of 32 threads.
    CUDA.@sync @cuda threads=32 kernel_wmma_int8_lowlevel(d_a, d_b, d_c, d_d)
    return nothing
end
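
As a sanity check (not part of the original report), the kernel's result can be verified on the host once compilation works again: the MMA computes D = A * B + C, and with small Int8 inputs the Int32 arithmetic is exact, so the comparison can use plain equality. check_kernel_wmma_int8 is a hypothetical helper:

function check_kernel_wmma_int8()
    d_a = CUDA.rand(Int8, 16, 16)
    d_b = CUDA.rand(Int8, 16, 16)
    d_c = CUDA.zeros(Int32, 16, 16)   # zero accumulator rules out Int32 overflow
    d_d = CUDA.zeros(Int32, 16, 16)
    CUDA.@sync @cuda threads=32 kernel_wmma_int8_lowlevel(d_a, d_b, d_c, d_d)
    # Both loads were column-major, so a plain CPU matmul is the reference.
    expected = Int32.(Array(d_a)) * Int32.(Array(d_b))
    return Array(d_d) == expected
end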

Output on 86b5069 and f37805e:

julia> launch_kernel_wmma_int8()
ERROR: UndefVarError: JuliaContext not defined
Stacktrace:
  [1] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:324
  [2] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler /scratch/pc2-mitarbeiter/bauerc/.julia/packages/GPUCompiler/I9fZc/src/cache.jl:90
  [3] cufunction(f::CUDA.var"#kernel#363", tt::Type{Tuple{CuDeviceMatrix{Int8, 1}, UInt32, UInt32}}; name::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:297
  [4] macro expansion
    @ /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:102 [inlined]
  [5] rand!(rng::CUDA.RNG, A::CuArray{Int8, 2, CUDA.Mem.DeviceBuffer})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:60
  [6] rand!(A::CuArray{Int8, 2, CUDA.Mem.DeviceBuffer})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:259
  [7] rand(T::Type, dim1::Int64, dims::Int64)
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:273
  [8] launch_kernel_wmma_int8()
    @ Main ./REPL[3]:5
  [9] top-level scope
    @ REPL[4]:1
 [10] top-level scope
    @ /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/initialization.jl:52
@carstenbauer carstenbauer added the bug Something isn't working label Mar 15, 2022
@carstenbauer carstenbauer changed the title Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? Mar 15, 2022
@carstenbauer
Member Author

(Technically, there is nothing to fix here, of course.)

@maleadt
Member

maleadt commented Mar 15, 2022

The unfortunate thing is that there is still no working CUDA.jl release with Int8 WMMA support, since 3.8.4 and 3.8.5, besides the new feature (#1119), also contain the culprit 86b5069. I wonder why the unit tests didn't detect this. Maybe we need more tests?

I don't understand what you're saying here. #1119 isn't part of any release yet.

Stopped working with 86b5069.

How?

@maleadt
Member

maleadt commented Mar 15, 2022

Since you're pointing to 1bab170, which doesn't make much sense wrt. Int8 WMMA, I take it you're trying to use CUDA.jl from master without using the Manifest. This is not supported. The commit that 'broke' CUDA.jl added a dependency on a specific GPUCompiler commit to the Manifest.
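
For reference, a minimal sketch of the supported workflow (path/to/CUDA.jl is a placeholder for the local clone of the commit under test): activate the clone's own environment so its committed Manifest, including the pinned GPUCompiler commit, is respected, rather than dev'ing the clone into another environment:

using Pkg
Pkg.activate("path/to/CUDA.jl")   # placeholder: the checked-out CUDA.jl clone
Pkg.instantiate()                 # installs the exact versions pinned in Manifest.toml
using CUDA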

@carstenbauer
Member Author

carstenbauer commented Mar 15, 2022

Since you're pointing to 1bab170, which doesn't make much sense wrt. Int8 WMMA, I take it you're trying to use CUDA.jl from master without using the Manifest. This is not supported. The commit that 'broke' CUDA.jl added a dependency on a specific GPUCompiler commit to the Manifest.

Ah, TIL. So my way of checking is flawed. (What I did was check out particular commits of the repo and then ] dev the local repo in another Julia environment.)

I don't understand what you're saying here. #1119 isn't part of any release yet.

I looked at https://github.com/JuliaGPU/CUDA.jl/releases/tag/v3.8.4 but I see now that the PRs mentioned there didn't necessarily go into the release.

I'll close this then. Sorry for the noise!

@maleadt
Member

maleadt commented Mar 15, 2022

I looked at https://github.com/JuliaGPU/CUDA.jl/releases/tag/v3.8.4 but I see now that the PRs mentioned there didn't necessarily go into the release.

Yeah, that's confusing, I know... It's a long-standing issue with TagBot.jl: JuliaRegistries/TagBot#181

Generally I try to stick to released versions of dependencies, but there have been some invasive changes to GPUCompiler.jl that I wanted to have live on CUDA.jl#master for a while, so that's why we commit the Manifest.
