Is this a serious program? #251

Open
idreamerhx opened this issue Aug 12, 2023 · 3 comments

Comments

@idreamerhx

root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ./clients/staging/hipblaslt-bench -m 2048 -n 2048 -k 2048 --precision f32_r -v 1 --activation_type relu
Query device success: there are 1 devices

Device ID 0 : AMD Radeon VII gfx906:sramecc+:xnack-
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64

rocblaslt warning: No paths matched /opt/hipBLASLt/build/release/library/../Tensile/library/gfx906co. Make sure that HIPBLASLT_TENSILE_LIBPATH is set correctly.
transA,transB,grouped_gemm,batch_count,M,N,K,alpha,lda,stride_a,beta,ldb,stride_b,ldc,stride_c,ldd,stride_d,d_type,compute_type,activation_type,bias_vector,hipblaslt-Gflops,us,CPU-Gflops,CPU-us,norm_error_1
N,N,0,1,2048,2048,2048,1,2048,4194304,0,2048,4194304,2048,4194304,2048,4194304,f32_r,f32_r,relu,0, 2.72763e+06, 6.3,4.47063,3.84376e+06,1.08487
root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ./clients/staging/hipblaslt-bench -m 1024 -n 1024 -k 1024 --precision f32_r -v 1 --activation_type relu
Query device success: there are 1 devices

Device ID 0 : AMD Radeon VII gfx906:sramecc+:xnack-
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64

rocblaslt warning: No paths matched /opt/hipBLASLt/build/release/library/../Tensile/library/gfx906co. Make sure that HIPBLASLT_TENSILE_LIBPATH is set correctly.
transA,transB,grouped_gemm,batch_count,M,N,K,alpha,lda,stride_a,beta,ldb,stride_b,ldc,stride_c,ldd,stride_d,d_type,compute_type,activation_type,bias_vector,hipblaslt-Gflops,us,CPU-Gflops,CPU-us,norm_error_1
N,N,0,1,1024,1024,1024,1,1024,1048576,0,1024,1048576,1024,1048576,1024,1048576,f32_r,f32_r,relu,0, 279030, 7.7,4.39526,488829,1.12318
root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ^C
root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ./clients/staging/hipblaslt-bench -m 102^C-n 1024 -k 1024 --precision f32_r -v 1 --activation_type relu
root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ./clients/staging/hipblaslt-bench --precision f32_r -v 1
Query device success: there are 1 devices

Device ID 0 : AMD Radeon VII gfx906:sramecc+:xnack-
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64

rocblaslt warning: No paths matched /opt/hipBLASLt/build/release/library/../Tensile/library/gfx906co. Make sure that HIPBLASLT_TENSILE_LIBPATH is set correctly.
transA,transB,grouped_gemm,batch_count,M,N,K,alpha,lda,stride_a,beta,ldb,stride_b,ldc,stride_c,ldd,stride_d,d_type,compute_type,activation_type,bias_vector,hipblaslt-Gflops,us,CPU-Gflops,CPU-us,norm_error_1
N,N,0,1,128,128,128,1,128,16384,0,128,16384,128,16384,128,16384,f32_r,f32_r,none,0, 776.723, 5.4,4.06425,1032,1.07202

How can this GPU deliver more than 200 TFLOPS? The 1024x1024x1024 run reports 279030 in the hipblaslt-Gflops column.
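As a sanity check, the hipblaslt-Gflops column can be reproduced from the M, N, K and us columns using the usual 2*M*N*K operation count for a GEMM. The sketch below does that for the three runs above and compares the result with a rough FP32 peak for the card. The peak is an assumption, not something the logs state: it takes 3840 stream processors for the Radeon VII and 2 FLOPs per processor per cycle at the reported 1801 MHz SCLK. The function and constant names are illustrative only.

```python
# Recompute GEMM throughput from the benchmark's own M, N, K and timing columns.

def gemm_gflops(m: int, n: int, k: int, microseconds: float) -> float:
    """Throughput in Gflop/s, counting 2*M*N*K floating-point operations."""
    flops = 2.0 * m * n * k
    seconds = microseconds * 1e-6
    return flops / seconds / 1e9

# ASSUMPTION: 3840 stream processors (not in the logs), 2 FLOPs per cycle
# (one fused multiply-add), at the 1801 MHz SCLK the logs report.
ASSUMED_STREAM_PROCESSORS = 3840
SCLK_GHZ = 1.801
ASSUMED_PEAK_GFLOPS = ASSUMED_STREAM_PROCESSORS * 2 * SCLK_GHZ

# (M, N, K, us) copied from the three result rows in the logs above.
runs = [
    (2048, 2048, 2048, 6.3),
    (1024, 1024, 1024, 7.7),
    (128, 128, 128, 5.4),
]

for m, n, k, us in runs:
    gflops = gemm_gflops(m, n, k, us)
    ratio = gflops / ASSUMED_PEAK_GFLOPS
    print(f"{m}x{n}x{k}: {gflops:,.0f} Gflop/s, {ratio:.2f}x the assumed peak")
```

The recomputed values agree with the reported column to within the rounding of the us field, so the throughput figures follow directly from the measured 5 to 8 microsecond timings. Under the assumed peak, the two larger runs exceed what the hardware could do, which suggests those timings do not reflect a completed matrix multiply.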

@idreamerhx
Author

root@1a89b5aa5fce:/opt/hipBLASLt/build/release# ./clients/staging/hipblaslt-test
hipBLASLt version: 300

Query device success: there are 1 devices

Device ID 0 : AMD Radeon VII gfx906:sramecc+:xnack-
with 17.2 GB memory, max. SCLK 1801 MHz, max. MCLK 1000 MHz, compute capability 9.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64

info: parsing of test data may take a couple minutes before any test output appears...

[==========] Running 10091 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 10046 tests from _/matmul_test
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg (240 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2 (0 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3 (0 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_alpha_beta_zero_NaN_f16_rf16_rf16_rf16_rf32_r_NN_256_128_64_nnan_256_64_nnan_256_256_1

rocblaslt warning: No paths matched /opt/hipBLASLt/build/release/library/../Tensile/library/gfx906co. Make sure that HIPBLASLT_TENSILE_LIBPATH is set correctly.
/opt/hipBLASLt/clients/gtest/../include/unit.hpp:208: Failure
Expected equality of these values:
float(hCPU[i + j * size_t(lda) + k * strideA])
Which is: 0
float(hGPU[i + j * size_t(lda) + k * strideA])
Which is: 0.0050582886
[ FAILED ] _/matmul_test.matmul/pre_checkin_alpha_beta_zero_NaN_f16_rf16_rf16_rf16_rf32_r_NN_256_128_64_nnan_256_64_nnan_256_256_1, where GetParam() = { function: "matmul", name: "alpha_beta_zero_NaN", category: "pre_checkin", known_bug_platforms: "", alpha: -nan, beta: -nan, stride_a: 16384, stride_b: 8192, stride_c: 32768, stride_d: 32768, stride_e: 32768, user_allocated_workspace: 0, M: 256, N: 128, K: 64, lda: 256, ldb: 64, ldc: 256, ldd: 256, lde: 256, batch_count: 1, iters: 10, cold_iters: 2, algo: 0, solution_index: 0, a_type: f16_r, b_type: f16_r, c_type: f16_r, d_type: f16_r, compute_type: f32_r, scale_type: f32_r, initialization: "rand_int", gpu_arch: "", pad: 4096, grouped_gemm: 0, threads: 0, streams: 0, devices: (5836 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_alpha_beta_zero_NaN_f16_rf16_rf16_rf16_rf32_r_NN_256_128_64_nnan_256_64_2_256_256_1
/opt/hipBLASLt/clients/gtest/../include/unit.hpp:208: Failure
Expected equality of these values:
float(hCPU[i + j * size_t(lda) + k * strideA])

@jichangjichang
Collaborator

@idreamerhx hipBLASLt currently supports only gfx90a devices.
https://github.com/ROCmSoftwarePlatform/hipBLASLt/blob/develop/README.md#hardware-requirements
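A quick way to check a device against that requirement is to pull the architecture name out of the "Device ID" line the clients print and compare it with the supported set. This is a minimal sketch, not part of hipBLASLt: the helper names are hypothetical, and the supported set contains only gfx90a because that is the only architecture named in this thread.

```python
# Extract the gfx architecture from a device line and compare it with the
# architectures stated as supported in this thread.

# ASSUMPTION: only gfx90a, as stated by the maintainer above; the README may
# list something different for other releases.
SUPPORTED_ARCHS = {"gfx90a"}

def base_arch(device_line: str) -> str:
    """Return e.g. 'gfx906' from '... gfx906:sramecc+:xnack-'."""
    for token in device_line.split():
        if token.startswith("gfx"):
            # Drop feature suffixes such as ':sramecc+' and ':xnack-'.
            return token.split(":")[0]
    raise ValueError("no gfx architecture found in device line")

def is_supported(device_line: str) -> bool:
    return base_arch(device_line) in SUPPORTED_ARCHS

line = "Device ID 0 : AMD Radeon VII gfx906:sramecc+:xnack-"
print(base_arch(line), "supported" if is_supported(line) else "not supported")
```

For the device in the logs above this reports gfx906 as not supported, which matches the "No paths matched" warning about the missing Tensile library for that architecture.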

@ppanchad-amd

@idreamerhx Can you please test with the latest ROCm 6.1.2 to see if your issue still exists? If not, please close the ticket. Thanks!
