Commit
./tmad/teeMadX.sh -ggttggg +10x -makeclean -inlLonly

STARTED AT Fri Aug 30 08:08:13 AM CEST 2024
ENDED AT Fri Aug 30 09:40:38 AM CEST 2024

Note: both CUDA and C++ are 5-15% slower in HELINL=L than in HELINL=0
For CUDA this can be seen both in the madevent test and in the check.exe test

diff -u --color tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inlL_hrd0.txt tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt

(C++ madevent test, 15% slower)
-Executing ' ./build.512y_d_inlL_hrd0/madevent_cpp < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
+Executing ' ./build.512y_d_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
  [OPENMPTH] omp_get_max_threads/nproc = 1/4
  [NGOODHEL] ngoodhel/ncomb = 128/128
  [XSECTION] VECSIZE_USED = 8192
@@ -401,10 +401,10 @@
  [XSECTION] ChannelId = 1
  [XSECTION] Cross section = 2.332e-07 [2.3322993086656014E-007] fbridge_mode=1
  [UNWEIGHT] Wrote 303 events (found 1531 events)
- [COUNTERS] PROGRAM TOTAL : 325.4847s
- [COUNTERS] Fortran Overhead ( 0 ) : 4.5005s
- [COUNTERS] CudaCpp MEs ( 2 ) : 320.9382s for 90112 events => throughput is 2.81E+02 events/s
- [COUNTERS] CudaCpp HEL ( 3 ) : 0.0460s
+ [COUNTERS] PROGRAM TOTAL : 286.1989s
+ [COUNTERS] Fortran Overhead ( 0 ) : 4.4892s
+ [COUNTERS] CudaCpp MEs ( 2 ) : 281.6678s for 90112 events => throughput is 3.20E+02 events/s
+ [COUNTERS] CudaCpp HEL ( 3 ) : 0.0420s

(CUDA madevent test, 10% slower)
-Executing ' ./build.cuda_d_inlL_hrd0/madevent_cuda < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
+Executing ' ./build.cuda_d_inl0_hrd0/madevent_cuda < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
  [OPENMPTH] omp_get_max_threads/nproc = 1/4
  [NGOODHEL] ngoodhel/ncomb = 128/128
  [XSECTION] VECSIZE_USED = 8192
@@ -557,10 +557,10 @@
  [XSECTION] ChannelId = 1
  [XSECTION] Cross section = 2.332e-07 [2.3322993086656006E-007] fbridge_mode=1
  [UNWEIGHT] Wrote 303 events (found 1531 events)
- [COUNTERS] PROGRAM TOTAL : 19.6828s
- [COUNTERS] Fortran Overhead ( 0 ) : 4.9752s
- [COUNTERS] CudaCpp MEs ( 2 ) : 13.4712s for 90112 events => throughput is 6.69E+03 events/s
- [COUNTERS] CudaCpp HEL ( 3 ) : 1.2365s
+ [COUNTERS] PROGRAM TOTAL : 17.9918s
+ [COUNTERS] Fortran Overhead ( 0 ) : 4.9757s
+ [COUNTERS] CudaCpp MEs ( 2 ) : 11.9277s for 90112 events => throughput is 7.55E+03 events/s
+ [COUNTERS] CudaCpp HEL ( 3 ) : 1.0883s

(CUDA check test with large grid, 5% slower)
 *** EXECUTE GCHECK(MAX) -p 512 32 1 ***
-Process = SIGMA_SM_GG_TTXGGG_CUDA [nvcc 12.0.140 (gcc 11.3.1)] [inlineHel=L] [hardcodePARAM=0]
+Process = SIGMA_SM_GG_TTXGGG_CUDA [nvcc 12.0.140 (gcc 11.3.1)] [inlineHel=0] [hardcodePARAM=0]
 Workflow summary = CUD:DBL+THX:CURDEV+RMBDEV+MESDEV/none+NAVBRK
-EvtsPerSec[MECalcOnly] (3a) = ( 9.102842e+03 ) sec^-1
+EvtsPerSec[MECalcOnly] (3a) = ( 9.584992e+03 ) sec^-1
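
The slowdown figures quoted in the note can be cross-checked against the counters in the diff. The following is a minimal sketch in Python, not part of the test scripts: the helper names `slowdown_from_times` and `slowdown_from_throughputs` are hypothetical, and "slower" is assumed here to mean extra time per event relative to the HELINL=0 build. The numbers are copied from the `CudaCpp MEs` and `EvtsPerSec[MECalcOnly]` lines.

```python
def slowdown_from_times(t_slow, t_fast):
    """Percent extra time of the slower run, for the same number of events."""
    return (t_slow / t_fast - 1.0) * 100.0


def slowdown_from_throughputs(tput_slow, tput_fast):
    """Same quantity from throughputs: time per event is 1 / throughput."""
    return (tput_fast / tput_slow - 1.0) * 100.0


# "CudaCpp MEs" times in seconds for 90112 events (HELINL=L first, HELINL=0 second)
cpp_madevent = slowdown_from_times(320.9382, 281.6678)
cuda_madevent = slowdown_from_times(13.4712, 11.9277)

# "EvtsPerSec[MECalcOnly] (3a)" from the check test (HELINL=L first, HELINL=0 second)
cuda_check = slowdown_from_throughputs(9.102842e+03, 9.584992e+03)

for label, pct in [
    ("C++ madevent", cpp_madevent),
    ("CUDA madevent", cuda_madevent),
    ("CUDA check", cuda_check),
]:
    print(f"{label}: HELINL=L is {pct:.1f}% slower than HELINL=0")
```

With these inputs all three values fall inside the 5-15% range stated in the note. Measuring the relative drop in throughput instead of the extra time per event is an equally reasonable definition and gives slightly smaller percentages.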