
Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? #1442

Closed
carstenbauer opened this issue Mar 15, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@carstenbauer
Member

Worked after #1119 was merged until (and including) 32023f3. Stopped working with 86b5069. Fixed on master by the last commit, i.e. 1bab170.

The unfortunate thing is that there is still no working CUDA.jl release with Int8 WMMA support, since 3.8.4 and 3.8.5, besides the new feature (#1119), also contain the culprit 86b5069. I wonder why the unit tests didn't detect this. Maybe we need more tests?

For some context, I want to use this feature over at GPUInspector.jl in peakflops(; dtype=Int8, tensorcores=true).

MWE:

using CUDA

# Loads 16x16 Int8 tiles of A and B and a 16x16 Int32 tile of C (all
# column-major, stride 16), performs D = A * B + C on the tensor cores,
# and stores the Int32 result.
function kernel_wmma_int8_lowlevel(a_dev, b_dev, c_dev, d_dev)
    a_frag = WMMA.llvm_wmma_load_a_col_m16n16k16_global_stride_s8(pointer(a_dev), 16)
    b_frag = WMMA.llvm_wmma_load_b_col_m16n16k16_global_stride_s8(pointer(b_dev), 16)
    c_frag = WMMA.llvm_wmma_load_c_col_m16n16k16_global_stride_s32(pointer(c_dev), 16)

    d_frag = WMMA.llvm_wmma_mma_col_col_m16n16k16_s8(a_frag, b_frag, c_frag)

    WMMA.llvm_wmma_store_d_col_m16n16k16_global_stride_s32(pointer(d_dev), d_frag, 16)
    return nothing
end

function launch_kernel_wmma_int8()
    m = n = k = 16
    dtype_a = dtype_b = Int8    # multiplicand element type
    dtype_c = dtype_d = Int32   # accumulator/result element type
    d_a = CUDA.rand(dtype_a, m, k)
    d_b = CUDA.rand(dtype_b, k, n)
    d_c = CUDA.rand(dtype_c, m, n)
    d_d = CUDA.zeros(dtype_d, m, n)
    # WMMA is a warp-level operation, so launch a full warp of 32 threads.
    CUDA.@sync @cuda threads=32 kernel_wmma_int8_lowlevel(d_a, d_b, d_c, d_d)
    return nothing
end
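
As a sanity check (not part of the original report), the kernel's result can be verified on the host once compilation works again: the MMA computes D = A * B + C, and with small Int8 inputs the Int32 arithmetic is exact, so the comparison can use plain equality. check_kernel_wmma_int8 is a hypothetical helper:

function check_kernel_wmma_int8()
    d_a = CUDA.rand(Int8, 16, 16)
    d_b = CUDA.rand(Int8, 16, 16)
    d_c = CUDA.zeros(Int32, 16, 16)   # zero accumulator rules out Int32 overflow
    d_d = CUDA.zeros(Int32, 16, 16)
    CUDA.@sync @cuda threads=32 kernel_wmma_int8_lowlevel(d_a, d_b, d_c, d_d)
    # Both loads were column-major, so a plain CPU matmul is the reference.
    expected = Int32.(Array(d_a)) * Int32.(Array(d_b))
    return Array(d_d) == expected
end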

Output on 86b5069 and f37805e:

julia> launch_kernel_wmma_int8()
ERROR: UndefVarError: JuliaContext not defined
Stacktrace:
  [1] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:324
  [2] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler /scratch/pc2-mitarbeiter/bauerc/.julia/packages/GPUCompiler/I9fZc/src/cache.jl:90
  [3] cufunction(f::CUDA.var"#kernel#363", tt::Type{Tuple{CuDeviceMatrix{Int8, 1}, UInt32, UInt32}}; name::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:297
  [4] macro expansion
    @ /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/compiler/execution.jl:102 [inlined]
  [5] rand!(rng::CUDA.RNG, A::CuArray{Int8, 2, CUDA.Mem.DeviceBuffer})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:60
  [6] rand!(A::CuArray{Int8, 2, CUDA.Mem.DeviceBuffer})
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:259
  [7] rand(T::Type, dim1::Int64, dims::Int64)
    @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/random.jl:273
  [8] launch_kernel_wmma_int8()
    @ Main ./REPL[3]:5
  [9] top-level scope
    @ REPL[4]:1
 [10] top-level scope
    @ /scratch/pc2-mitarbeiter/bauerc/devel/GPUInspector.jl/dev/CUDA/src/initialization.jl:52
@carstenbauer carstenbauer added the bug Something isn't working label Mar 15, 2022
@carstenbauer carstenbauer changed the title Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? Mar 15, 2022
@carstenbauer
Member Author

(Technically, there is nothing to fix here, of course.)

@maleadt
Member

maleadt commented Mar 15, 2022

The unfortunate thing is that there is still no working CUDA.jl release with Int8 WMMA support, since 3.8.4 and 3.8.5, besides the new feature (#1119), also contain the culprit 86b5069. I wonder why the unit tests didn't detect this. Maybe we need more tests?

I don't understand what you're saying here. #1119 isn't part of any release yet.

Stopped working with 86b5069.

How?

@maleadt
Member

maleadt commented Mar 15, 2022

Since you're pointing to 1bab170, which doesn't make much sense wrt. Int8 WMMA, I take it you're trying to use CUDA.jl from master without using the Manifest. This is not supported. The commit that 'broke' CUDA.jl added a dependency on a specific GPUCompiler commit to the Manifest.
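
For reference, a minimal sketch of the supported workflow (path/to/CUDA.jl is a placeholder for the local clone of the commit under test): activate the clone's own environment so its committed Manifest, including the pinned GPUCompiler commit, is respected, rather than dev'ing the clone into another environment:

using Pkg
Pkg.activate("path/to/CUDA.jl")   # placeholder: the checked-out CUDA.jl clone
Pkg.instantiate()                 # installs the exact versions pinned in Manifest.toml
using CUDA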

@carstenbauer
Member Author

carstenbauer commented Mar 15, 2022

Since you're pointing to 1bab170, which doesn't make much sense wrt. Int8 WMMA, I take it you're trying to use CUDA.jl from master without using the Manifest. This is not supported. The commit that 'broke' CUDA.jl added a dependency on a specific GPUCompiler commit to the Manifest.

Ah, TIL. So my way of checking is flawed. (What I did was check out particular commits of the repo and then ] dev the local repo in another Julia environment.)

I don't understand what you're saying here. #1119 isn't part of any release yet.

I looked at https://github.com/JuliaGPU/CUDA.jl/releases/tag/v3.8.4 but I see now that the PRs mentioned there didn't necessarily go into the release.

I'll close this then. Sorry for the noise!

@maleadt
Member

maleadt commented Mar 15, 2022

I looked at https://github.com/JuliaGPU/CUDA.jl/releases/tag/v3.8.4 but I see now that the PRs mentioned there didn't necessarily go into the release.

Yeah, that's confusing, I know... It's a long-standing issue with TagBot.jl: JuliaRegistries/TagBot#181

Generally I try to stick to released versions of dependencies, but there have been some invasive changes to GPUCompiler.jl that I wanted to have live on CUDA.jl#master for a while, so that's why we commit the Manifest.
