[helas] rerun tmad ggttggg inlL
./tmad/teeMadX.sh -ggttggg +10x -makeclean -inlLonly
STARTED AT Fri Aug 30 08:08:13 AM CEST 2024
ENDED   AT Fri Aug 30 09:40:38 AM CEST 2024

Note: both CUDA and C++ are 5-15% slower with HELINL=L than with HELINL=0.
For CUDA this can be seen both in the madevent test and in the check.exe test.
In the diff excerpts below, '-' lines are HELINL=L and '+' lines are HELINL=0.

diff -u --color tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inlL_hrd0.txt tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt

(C++ madevent test, 15% slower)
-Executing ' ./build.512y_d_inlL_hrd0/madevent_cpp < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
+Executing ' ./build.512y_d_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
  [OPENMPTH] omp_get_max_threads/nproc = 1/4
  [NGOODHEL] ngoodhel/ncomb = 128/128
  [XSECTION] VECSIZE_USED = 8192
@@ -401,10 +401,10 @@
  [XSECTION] ChannelId = 1
  [XSECTION] Cross section = 2.332e-07 [2.3322993086656014E-007] fbridge_mode=1
  [UNWEIGHT] Wrote 303 events (found 1531 events)
- [COUNTERS] PROGRAM TOTAL          :  325.4847s
- [COUNTERS] Fortran Overhead ( 0 ) :    4.5005s
- [COUNTERS] CudaCpp MEs      ( 2 ) :  320.9382s for    90112 events => throughput is 2.81E+02 events/s
- [COUNTERS] CudaCpp HEL      ( 3 ) :    0.0460s
+ [COUNTERS] PROGRAM TOTAL          :  286.1989s
+ [COUNTERS] Fortran Overhead ( 0 ) :    4.4892s
+ [COUNTERS] CudaCpp MEs      ( 2 ) :  281.6678s for    90112 events => throughput is 3.20E+02 events/s
+ [COUNTERS] CudaCpp HEL      ( 3 ) :    0.0420s

(CUDA madevent test, 10% slower)
-Executing ' ./build.cuda_d_inlL_hrd0/madevent_cuda < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
+Executing ' ./build.cuda_d_inl0_hrd0/madevent_cuda < /tmp/avalassi/input_ggttggg_x10_cudacpp > /tmp/avalassi/output_ggttggg_x10_cudacpp'
  [OPENMPTH] omp_get_max_threads/nproc = 1/4
  [NGOODHEL] ngoodhel/ncomb = 128/128
  [XSECTION] VECSIZE_USED = 8192
@@ -557,10 +557,10 @@
  [XSECTION] ChannelId = 1
  [XSECTION] Cross section = 2.332e-07 [2.3322993086656006E-007] fbridge_mode=1
  [UNWEIGHT] Wrote 303 events (found 1531 events)
- [COUNTERS] PROGRAM TOTAL          :   19.6828s
- [COUNTERS] Fortran Overhead ( 0 ) :    4.9752s
- [COUNTERS] CudaCpp MEs      ( 2 ) :   13.4712s for    90112 events => throughput is 6.69E+03 events/s
- [COUNTERS] CudaCpp HEL      ( 3 ) :    1.2365s
+ [COUNTERS] PROGRAM TOTAL          :   17.9918s
+ [COUNTERS] Fortran Overhead ( 0 ) :    4.9757s
+ [COUNTERS] CudaCpp MEs      ( 2 ) :   11.9277s for    90112 events => throughput is 7.55E+03 events/s
+ [COUNTERS] CudaCpp HEL      ( 3 ) :    1.0883s

(CUDA check test with large grid, 5% slower)
 *** EXECUTE GCHECK(MAX) -p 512 32 1 ***
-Process                     = SIGMA_SM_GG_TTXGGG_CUDA [nvcc 12.0.140 (gcc 11.3.1)] [inlineHel=L] [hardcodePARAM=0]
+Process                     = SIGMA_SM_GG_TTXGGG_CUDA [nvcc 12.0.140 (gcc 11.3.1)] [inlineHel=0] [hardcodePARAM=0]
 Workflow summary            = CUD:DBL+THX:CURDEV+RMBDEV+MESDEV/none+NAVBRK
-EvtsPerSec[MECalcOnly] (3a) = ( 9.102842e+03                 )  sec^-1
+EvtsPerSec[MECalcOnly] (3a) = ( 9.584992e+03                 )  sec^-1
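
The slowdown figures quoted above can be cross-checked from the logged numbers. The following is a minimal sketch, not part of the tmad scripts: the function and variable names are hypothetical, and the values are copied by hand from the excerpts above ('-' lines are HELINL=L, '+' lines are HELINL=0). It expresses the slowdown as the extra time per event in HELINL=L relative to HELINL=0, which gives roughly 14% (C++ madevent), 13% (CUDA madevent) and 5% (CUDA check), consistent with the approximate figures in the note.

```python
# Hand-copied figures from the log excerpts (assumption: transcribed correctly).
# Each tuple is (HELINL=L value, HELINL=0 value).
ME_SECONDS = {
    "C++ madevent": (320.9382, 281.6678),   # CudaCpp MEs time for 90112 events
    "CUDA madevent": (13.4712, 11.9277),    # CudaCpp MEs time for 90112 events
}
CHECK_EVTS_PER_SEC = {
    "CUDA check": (9.102842e+03, 9.584992e+03),  # EvtsPerSec[MECalcOnly]
}


def extra_time_from_seconds(t_inl_l, t_inl_0):
    """Fractional extra time of HELINL=L relative to HELINL=0, from timings."""
    return t_inl_l / t_inl_0 - 1.0


def extra_time_from_throughput(r_inl_l, r_inl_0):
    """Same quantity from throughputs: time per event is 1/throughput."""
    return r_inl_0 / r_inl_l - 1.0


def summarize():
    """Return {test name: fractional slowdown of HELINL=L vs HELINL=0}."""
    result = {}
    for name, (t_l, t_0) in ME_SECONDS.items():
        result[name] = extra_time_from_seconds(t_l, t_0)
    for name, (r_l, r_0) in CHECK_EVTS_PER_SEC.items():
        result[name] = extra_time_from_throughput(r_l, r_0)
    return result


if __name__ == "__main__":
    for name, frac in summarize().items():
        print(f"{name}: HELINL=L is {100.0 * frac:.1f}% slower than HELINL=0")
```

Using time per event rather than throughput as the baseline is a choice made here for illustration; measuring the drop in throughput instead gives slightly smaller percentages.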
valassi committed Aug 30, 2024
1 parent 7e930eb commit f25cd7a
Showing 1 changed file with 81 additions and 78 deletions.