WIP Improve timers (lower overhead using rdtsc) and profile additional Fortran components (other than MEs) #962
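
The lower-overhead idea in the title can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in timermap.h: `sketchTicks`, `sketchTicksPerSecond` and `SketchTimer` are hypothetical names, the `__rdtsc` intrinsic is assumed to be available through `<x86intrin.h>` (GCC or Clang on x86), and other platforms fall back to `std::chrono::steady_clock`. The raw tick count is converted to seconds only at the end, using a calibration against the chrono clock.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#if defined( __x86_64__ ) || defined( __i386__ )
#include <x86intrin.h> // assumption: GCC/Clang on x86 provide __rdtsc here
#define SKETCH_HAS_RDTSC 1
#else
#define SKETCH_HAS_RDTSC 0
#endif

// Hypothetical helper: read a raw tick count (TSC on x86, chrono ticks elsewhere)
inline uint64_t sketchTicks()
{
#if SKETCH_HAS_RDTSC
  return __rdtsc();
#else
  return static_cast<uint64_t>( std::chrono::steady_clock::now().time_since_epoch().count() );
#endif
}

// Hypothetical calibration: estimate ticks per second against steady_clock
inline double sketchTicksPerSecond()
{
  using clock = std::chrono::steady_clock;
  const auto t0 = clock::now();
  const uint64_t c0 = sketchTicks();
  // Busy-wait for about 10 ms so that the tick difference is well above the clock resolution
  while( std::chrono::duration<double>( clock::now() - t0 ).count() < 0.01 ) {}
  const uint64_t c1 = sketchTicks();
  const auto t1 = clock::now();
  return static_cast<double>( c1 - c0 ) / std::chrono::duration<double>( t1 - t0 ).count();
}

// Hypothetical timer: accumulate raw ticks, convert to seconds only when reporting
class SketchTimer
{
public:
  void start() { m_start = sketchTicks(); }
  void stop() { m_total += sketchTicks() - m_start; }
  uint64_t ticks() const { return m_total; }
  double seconds( double ticksPerSecond ) const
  {
    assert( ticksPerSecond > 0 );
    return static_cast<double>( m_total ) / ticksPerSecond;
  }
private:
  uint64_t m_start = 0;
  uint64_t m_total = 0;
};
```

In this sketch, `start()` and `stop()` each cost one tick read and one integer operation, and the floating-point conversion is deferred to `seconds()`. Whether this matches the overhead reduction measured in this PR is not something the sketch can show.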

Draft
wants to merge 107 commits into
base: master
7d01325
[prof] in gg_tt.mad counters.cc, start refactoring of counters - add …
valassi Aug 10, 2024
d43c2f0
[prof] in gg_tt.mad counters.cc driver.f auto_dsig1.f, complete refac…
valassi Aug 10, 2024
5ccf589
[prof] in gg_tt.mad genps.f, add profiling counters to x_to_f_args
valassi Aug 10, 2024
de7d63e
[prof] in gg_tt.mad NNPDFDriver.f add a counter for nnpdf (NB must ma…
valassi Aug 10, 2024
0ef123d
[prof] in gg_tt.mad counters.cc, reimplement counters without maps ag…
valassi Aug 10, 2024
22ce65a
[prof] in gg_tt.mad dsample.f, add time profilers also in sample_put_…
valassi Aug 10, 2024
ce655d0
[prof] in gg_tt.mad counters.cc, rename map_ as array_
valassi Aug 11, 2024
ee6f9f5
[prof] in gg_tt.mad counters.cc add a flag showing if a counter has b…
valassi Aug 11, 2024
feb7a68
[prof] in gg_tt.mad counters.cc, revert the addition of a flag showin…
valassi Aug 11, 2024
0681a76
[prof] in gg_tt.mad counters add an env variable CUDACPP_RUNTIME_DISA…
valassi Aug 11, 2024
de3eac4
[prof] in gg_tt.mad counters, revert the addition of an env variable …
valassi Aug 11, 2024
07e2a93
[prof] in gg_tt.mad counters.cc, improve the error message for counters
valassi Aug 11, 2024
5d4b128
[prof] in gg_tt.mad counters.cc, rename Fortran Overhead as Fortran O…
valassi Aug 11, 2024
6f85197
[prof] in gg_tt.mad, change counter numbers for all counters
valassi Aug 11, 2024
3b798e9
[prof] in gg_tt.mad, add a timer counter for the whole sample_full (e…
valassi Aug 11, 2024
5e4a93f
[prof] in gg_tt.mad, profile the fortran initial i/o: it is now clear…
valassi Aug 11, 2024
14b70ba
[prof] in gg_tt.mad dsample.f, move sample_put_point counters from in…
valassi Aug 11, 2024
d4bb207
[prof] in gg_tt.mad, profile prepare_grouping_choice and select_group…
valassi Aug 11, 2024
d035967
[prof] in gg_tt.mad, profile UPDATE_SCALE_COUPLING_VEC (as "test13" f…
valassi Aug 11, 2024
08b25a6
[prof] in gg_tt.mad, profile UNWGT (as "test16" for the moment, wip)
valassi Aug 11, 2024
cba16ec
[prof] in gg_tt.mad, move x_to_f profiling from genps.f to dsample.f
valassi Aug 12, 2024
968dd22
[prof] in gg_tt.mad, move all COUNTERS_REGISTER_COUNTER calls to driv…
valassi Aug 12, 2024
f94794d
[prof] in gg_tt.mad, move PDF counters from NNPDFDriver.f to auto_dsi…
valassi Aug 12, 2024
ef8cff8
[prof] in gg_tt.mad, profile REWGT (as "test14" for the moment, wip)
valassi Aug 12, 2024
e073613
[prof] in gg_tt.mad, add a second "program initial_i/o" counter"
valassi Aug 12, 2024
fbd5322
[prof] in gg_tt.mad driver.f, clean up comments in counters_register …
valassi Aug 12, 2024
04e39de
[prof] in gg_tt.mad driver.f, rename timers for unwgt, rewgt, scale, …
valassi Aug 12, 2024
62d7c4e
[prof] in gg_tt.mad dsample.f, remove the timer for grouping function…
valassi Aug 12, 2024
d3165cb
[prof] in gg_tt.mad auto_dsig1.f, add profiling for matrix1 also in d…
valassi Aug 12, 2024
d474e21
[prof] in gg_tt.mad, revert the profiling for matrix1 in dsig1
valassi Aug 12, 2024
59dbf04
[prof] in gg_tt.mad, profile ranmar (in ranmar.f: but this causes dou…
valassi Aug 12, 2024
117bd1e
[prof] in gg_tt.mad, revert the profiling of ranmar
valassi Aug 12, 2024
c356280
[prof] in gg_tt.mad driver.f, profile bridge creation/deletion (as te…
valassi Aug 12, 2024
6f86051
[prof] in gg_tt.mad, cleanly define Cudacpp initialise (bridge creati…
valassi Aug 12, 2024
255c343
[prof] in gg_tt.mad, start cleaning up timers: remove the two PROGRAM…
valassi Aug 12, 2024
568e024
[prof] in gg_tt.mad, complete cleanup of timers, with better names an…
valassi Aug 12, 2024
e1e212e
[prof] in gg_tt.mad counters.cc, add "OVERALL MEs" and "OVERALL NON-M…
valassi Aug 12, 2024
c80aa78
[prof] in gg_tt.mad counters add again an env variable CUDACPP_RUNTIM…
valassi Aug 12, 2024
5b24462
[prof] in gg_tt.mad counters.cc, consider printing throughputs only f…
valassi Aug 12, 2024
c9a72f3
[prof] in gg_tt.mad counters.cc, revert the last change
valassi Aug 12, 2024
c330fb1
[prof] in gg_tt.mad counters.cc, fix clang format
valassi Aug 12, 2024
555d91f
[prof] regenerate CODEGEN patch from gg_tt.mad including additional c…
valassi Aug 12, 2024
56404b3
[prof] regenerate all processes
valassi Aug 12, 2024
5a2f534
[prof] rerun 102 tput tests on itscrd90 - all ok
valassi Aug 13, 2024
82f87c2
[prof] rerun 30 tmad tests on itscrd90 WITH NEW COUNTERS - all as exp…
valassi Aug 13, 2024
93cf80e
[prof] in gg_tt.mad, profile gen_mom (13) and sample_get_discrete_x (…
valassi Aug 14, 2024
f77cd1f
[prof] in gg_tt.mad, profile also subsections of genmom... is there a…
valassi Aug 14, 2024
20178c7
[prof] in gg_tt.mad, revert the last two commits (remove test profili…
valassi Aug 19, 2024
17aeb61
[prof] go back to previous tput and tmad logs for easier merging of c…
valassi Aug 19, 2024
2af35cb
[cmsdyps/prof] in gg_tt.mad, backport changes from pp_dy3j.mad (P0_gu…
valassi Aug 17, 2024
0f65d33
[cmsdyps/prof] rerun one tput test for ggtt with the new timers, chec…
valassi Aug 17, 2024
83202ca
[cmsdyps/prof] in gg_tt.mad timermap.h, move to using rdtsc timers by…
valassi Aug 17, 2024
c077f83
[cmsdyps/prof] in tput/throughputX.sh, add a printout about chrono vs…
valassi Aug 17, 2024
88f6916
[cmsdyps/prof] rerun one tput test for ggtt with chrono timers, no ch…
valassi Aug 17, 2024
d10e7f4
[cmsdyps/prof] rerun one tput test for ggtt with rdtsc timers, essent…
valassi Aug 17, 2024
90c863b
[cmsdyps/prof] in gg_tt.mad, backport latest changes in timers and co…
valassi Aug 19, 2024
609b4e4
[cmsdyps/prof] rerun one tput test for ggtt with new chrono timers - …
valassi Aug 19, 2024
d06e6a4
[cmsdyps/prof] rerun one tput test for ggtt with new rdtsc timers - t…
valassi Aug 19, 2024
48c8c79
[cmsdyps/prof] in gg_tt.mad timermap.h and check_sa,cc, fix the calib…
valassi Aug 19, 2024
9bf5e6e
[cmsdyps/prof] rerun one tput test for ggtt with new chrono timers - …
valassi Aug 19, 2024
a1c9b7a
[cmsdyps/prof] rerun one tput test for ggtt with new rdtsc timers - n…
valassi Aug 19, 2024
5fe76e0
[prof] in CODEGEN, backport the latest changes to timermap.h, check_s…
valassi Aug 19, 2024
3435f56
[prof] in CODEGEN, fix clang format for timermap.h, check_sa.cpp, tim…
valassi Aug 19, 2024
6f7076a
[prof] regenerate CODEGEN patch from gg_tt.mad including htuple comme…
valassi Aug 19, 2024
0db0718
[prof] in gg_tt.mad, fix clang format for timermap.h, check_sa.cpp, t…
valassi Aug 19, 2024
e2b46f2
[prof] regenerate gg_tt.mad, all ok
valassi Aug 19, 2024
5d75bb4
[prof] regenerate all processes
valassi Aug 19, 2024
6eb36a6
[prof] rerun a simple tmad test for ggtt... times look ok but through…
valassi Aug 19, 2024
4e7e07c
[prof] in gg_tt.mad and CODEGEN, fix a silly bug in throughputs (was …
valassi Aug 19, 2024
2e43faf
[prof] revert tmad run of ggtt with throughput bug
valassi Aug 19, 2024
42cad8d
[prof] rerun again a simple tmad test for ggtt... now times and throu…
valassi Aug 19, 2024
607abfc
[prof] regenerate gg_tt.mad, all ok
valassi Aug 19, 2024
9a03440
[prof] manually fix counters.cc in all generated processes
valassi Aug 19, 2024
f0a7a3a
[prof] rerun 102 tput tests (with new rdtcs timers) on itscrd90 - all ok
valassi Aug 20, 2024
db32587
[prof] ** COMPLETE PROF ** rerun 30 tmad tests on itscrd90 (with new …
valassi Aug 20, 2024
95329f3
[prof] move to upstream/master codegen logs to ease merging
valassi Aug 21, 2024
9b394e6
Merge remote-tracking branch 'upstream/master' (with hel #960, mac #9…
valassi Aug 21, 2024
56d73ff
[prof] regenerate all processes after merging upstream/master
valassi Aug 21, 2024
9ac0039
[prof] in gg_tt.mad and CODEGEN timers/counters, disable Rdtsc counte…
valassi Aug 21, 2024
5c8d579
[prof] regenerate all processes after disabling Rdtsc counters on pla…
valassi Aug 21, 2024
c60de03
[prof] in CODEGEN/generateAndCompare.sh, add gux_taptamggux (similar …
valassi Aug 23, 2024
3a94376
[prof] add gux_taptamggux.mad to CODEGEN/allGenerateAndCompare.sh
valassi Aug 23, 2024
af682f3
[prof] add gux_taptamggux.mad to the repo, for timer tests
valassi Aug 23, 2024
5f22187
[prof] in gux_taptamggux.mad, switch on SampleGetX profiling as a test
valassi Aug 23, 2024
5c0a2ed
[prof] in gux_taptamggux.mad timer.h, add the option to remove overhe…
valassi Aug 23, 2024
ad9b747
[prof] in gux_taptamggux.mad timer.h, add instead a getTotalOverheadS…
valassi Aug 23, 2024
464b9d7
[prof] in gux_taptamggux.mad counters.cc, add the option to remove ti…
valassi Aug 23, 2024
e33250a
[prof] in gux_taptamggux.mad counters.cc, improve handling of TEST CO…
valassi Aug 23, 2024
eba8039
[prof] in gux_taptamggux.mad counters.cc, add a mechanism for declari…
valassi Aug 23, 2024
51bbbaa
[prof] in gux_taptamggux.mad counters.cc, add a printout of the estim…
valassi Aug 23, 2024
fe44fa9
[prof] in gux_taptamggux.mad, declare SampleGetX as included in Phase…
valassi Aug 23, 2024
5d3da5a
[prof] in gux_taptamggux.mad timer.h, remove all handling of overhead…
valassi Aug 23, 2024
3577a55
[prof] in gux_taptamggux.mad counters.h, move here the handling of co…
valassi Aug 23, 2024
6dcab81
[prof] in gux_taptamggux.mad counters.h, improve the handling of coun…
valassi Aug 23, 2024
ef82161
[prof] move to CODEGEN logs from the latest upstream/master for easie…
valassi Sep 2, 2024
eb7e826
Merge remote-tracking branch 'upstream/master' (including new CI and …
valassi Sep 2, 2024
2525410
[prof] move to tput/tmad logs from the latest upstream/master for eas…
valassi Sep 16, 2024
34041b7
[prof] move to auto_dsig1.f from the latest upstream/master in all ge…
valassi Sep 16, 2024
4df3dfa
Merge remote-tracking branch 'upstream/master' (including june24, goo…
valassi Sep 16, 2024
4d91140
[prof] in gg_tt.mad auto_dsig1.f, add back all counters as in the pro…
valassi Sep 16, 2024
4526aac
[prof] regenerate CODEGEN patch from gg_tt.mad after merging an old u…
valassi Oct 4, 2024
5eacb46
[prof] regenerate all processes after merging an old 'upstream/master…
valassi Oct 4, 2024
95a9070
[prof] move to the latest upstream/master CODEGEN logs for easier mer…
valassi Oct 4, 2024
7c6ba3d
[prof] move to dsample.f from the latest upstream/master in all gener…
valassi Oct 4, 2024
416a52b
Merge remote-tracking branch 'upstream/master' (including v1.0.0 and …
valassi Oct 4, 2024
3ec0964
[prof] regenerate CODEGEN patch from gg_tt.mad after merging upstream…
valassi Oct 4, 2024
817dd25
[prof] regenerate all processes after merging upstream/master(v1.0.0 …
valassi Oct 4, 2024
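
Several commits above describe counters that are registered by index, started with an event count and stopped again, and stored in arrays rather than maps. A minimal sketch of that pattern is given here. It is not the code in counters.cc: the `sketch_*` function names, the `NCOUNTERSMAX` bound and the `Counter` fields are hypothetical, timing uses `std::chrono::steady_clock` for simplicity, and the Fortran-callable signatures (arguments passed by pointer, trailing underscore, null-terminated name) are assumptions about a typical gfortran-style calling convention.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <string>

namespace sketch
{
  constexpr int NCOUNTERSMAX = 30; // hypothetical upper bound on counter indices

  struct Counter
  {
    std::string name; // an empty name means "not registered"
    long long nevents = 0; // total number of events attributed to this counter
    double seconds = 0; // total accumulated time
    bool running = false;
    std::chrono::steady_clock::time_point t0;
  };

  // Fixed-size array indexed by counter number (no map lookups on start/stop)
  inline std::array<Counter, NCOUNTERSMAX + 1>& counters()
  {
    static std::array<Counter, NCOUNTERSMAX + 1> array_;
    return array_;
  }
}

extern "C"
{
  // Hypothetical signature: Fortran passes arguments by reference
  void sketch_register_counter_( const int* pindex, const char* name )
  {
    assert( *pindex >= 0 && *pindex <= sketch::NCOUNTERSMAX );
    sketch::Counter& c = sketch::counters()[*pindex];
    assert( c.name.empty() ); // each index may be registered only once
    c.name = name; // relies on the caller appending char(0)
  }

  void sketch_start_counter_( const int* pindex, const int* pnevents )
  {
    assert( *pindex >= 0 && *pindex <= sketch::NCOUNTERSMAX );
    sketch::Counter& c = sketch::counters()[*pindex];
    assert( !c.name.empty() && !c.running ); // must be registered and not already running
    c.nevents += *pnevents; // passing 0 adds time without adding events
    c.running = true;
    c.t0 = std::chrono::steady_clock::now();
  }

  void sketch_stop_counter_( const int* pindex )
  {
    assert( *pindex >= 0 && *pindex <= sketch::NCOUNTERSMAX );
    sketch::Counter& c = sketch::counters()[*pindex];
    assert( c.running );
    c.seconds += std::chrono::duration<double>( std::chrono::steady_clock::now() - c.t0 ).count();
    c.running = false;
  }
}
```

With this layout, a start call with an event count of 0 accumulates time without changing the event total, which is the behaviour the diff comments describe for counter 11 after bridge creation.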
@@ -1,8 +1,141 @@
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig.f
index bc9bcfeb9..0c1962d3e 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig.f
@@ -315,8 +315,10 @@ C ENDDO

C set the running scale
C and update the couplings accordingly
+ CALL COUNTERS_START_COUNTER( 5, VECSIZE_USED ) ! FortranUpdateScaleCouplings=5
CALL UPDATE_SCALE_COUPLING_VEC(ALL_P, ALL_WGT, ALL_Q2FACT,
$ VECSIZE_USED)
+ CALL COUNTERS_STOP_COUNTER( 5 ) ! FortranUpdateScaleCouplings=5

IF(GROUPED_MC_GRID_STATUS.EQ.0) THEN
C If we were in the initialization phase of the grid for MC over
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
index db3c284ca..f1cd4e976 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
@@ -127,6 +127,7 @@ C Continue only if IMODE is 0, 4 or 5
IF(IMODE.NE.0.AND.IMODE.NE.4.AND.IMODE.NE.5) RETURN


+ CALL COUNTERS_START_COUNTER( 4, 1 ) ! FortranPDFs=4
IF (ABS(LPP(IB(1))).GE.1) THEN
C LP=SIGN(1,LPP(IB(1)))
IF (DSQRT(Q2FACT(IB(1))).EQ.0D0) THEN
@@ -148,6 +149,7 @@ C LP=SIGN(1,LPP(IB(2)))
ENDIF
G2=PDG2PDF(LPP(IB(2)),0, IB(2),XBK(IB(2)), QSCALE)
ENDIF
+ CALL COUNTERS_STOP_COUNTER( 4 ) ! FortranPDFs=4
PD(0) = 0D0
IPROC = 0
IPROC=IPROC+1 ! g g > t t~
@@ -186,7 +188,9 @@ C Select a flavor combination (need to do here for right sign)
R=R-DABS(PD(IPSEL))/PD(0)
ENDDO

+ CALL COUNTERS_START_COUNTER( 6, 1 ) ! FortranReweight=6
DSIGUU=DSIGUU*REWGT(PP,1)
+ CALL COUNTERS_STOP_COUNTER( 6 ) ! FortranReweight=6

C Apply the bias weight specified in the run card (default is 1.0)
DSIGUU=DSIGUU*CUSTOM_BIAS(PP,DSIGUU,1,1)
@@ -360,6 +364,7 @@ C Continue only if IMODE is 0, 4 or 5
STOP
ENDIF

+ CALL COUNTERS_START_COUNTER( 4, VECSIZE_USED ) ! FortranPDFs=4
DO CURR_WARP=1, NB_WARP_USED
IF(IMIRROR_VEC(CURR_WARP).EQ.1)THEN
IB(1) = 1
@@ -382,6 +387,7 @@ C LP=SIGN(1,LPP(IB(2)))
ENDIF
ENDDO ! IWARP LOOP
ENDDO ! CURRWARP LOOP
+ CALL COUNTERS_STOP_COUNTER( 4 ) ! FortranPDFs=4
ALL_PD(0,:) = 0D0
IPROC = 0
IPROC=IPROC+1 ! g g > t t~
@@ -426,7 +432,9 @@ C Select a flavor combination (need to do here for right sign)
CHANNEL = SUBDIAG(1)


+ CALL COUNTERS_START_COUNTER( 6, 1 ) ! FortranReweight=6
ALL_RWGT(IVEC) = REWGT(ALL_PP(0,1,IVEC), IVEC)
+ CALL COUNTERS_STOP_COUNTER( 6 ) ! FortranReweight=6

IF(FRAME_ID.NE.6)THEN
CALL BOOST_TO_FRAME(ALL_PP(0,1,IVEC), FRAME_ID, P_MULTI(0
@@ -482,11 +490,13 @@ C Set sign of dsig based on sign of PDF and matrix element
ALL_OUT(IVEC)=0D0
ENDIF
C Generate events only if IMODE is 0.
+ CALL COUNTERS_START_COUNTER( 7, 1 ) ! FortranUnweight=7
IF(IMODE.EQ.0.AND.DABS(ALL_OUT(IVEC)).GT.0D0)THEN
C Call UNWGT to unweight and store events
CALL UNWGT(ALL_PP(0,1,IVEC), ALL_OUT(IVEC)*ALL_WGT(IVEC),1,
$ SELECTED_HEL(IVEC), SELECTED_COL(IVEC), IVEC)
ENDIF
+ CALL COUNTERS_STOP_COUNTER( 7 ) ! FortranUnweight=7
ENDDO

END
@@ -555,7 +565,7 @@ C Call UNWGT to unweight and store events

IF( FBRIDGE_MODE .LE. 0 ) THEN ! (FortranOnly=0 or BothQuiet=-1 or BothDebug=-2)
#endif
- CALL COUNTERS_SMATRIX1MULTI_START( -1, VECSIZE_USED ) ! fortranMEs=-1
+ CALL COUNTERS_START_COUNTER( 9, VECSIZE_USED ) ! FortranMEs=9
DO IVEC=1, VECSIZE_USED
CALL SMATRIX1(P_MULTI(0,1,IVEC),
& hel_rand(IVEC),
@@ -571,7 +581,7 @@ C ======================================================
C *START* Included from CUDACPP template smatrix_multi.f
C (into function smatrix$i_multi in auto_dsig$i.f)
C ======================================================
- CALL COUNTERS_SMATRIX1MULTI_STOP( -1 ) ! fortranMEs=-1
+ CALL COUNTERS_STOP_COUNTER( 9 ) ! FortranMEs=9
#ifdef MG5AMC_MEEXPORTER_CUDACPP
ENDIF

@@ -581,7 +591,7 @@ C ======================================================
STOP
ENDIF
IF ( FIRST ) THEN ! exclude first pass (helicity filtering) from timers (#461)
- CALL COUNTERS_SMATRIX1MULTI_START( 1, VECSIZE_USED ) ! cudacppHEL=1
+ CALL COUNTERS_START_COUNTER( 11, 0 ) ! 11=CudaCpp-Initialise (was CudaCpp-HEL; counter set to 1 on bridge creation, do not increment it further)
CALL FBRIDGESEQUENCE_NOMULTICHANNEL( FBRIDGE_PBRIDGE, ! multi channel disabled for helicity filtering
& P_MULTI, ALL_G, HEL_RAND, COL_RAND, OUT2,
& SELECTED_HEL2, SELECTED_COL2, .TRUE.) ! quit after computing helicities
@@ -602,9 +612,9 @@ C ENDIF
ENDIF
WRITE (6,*) 'NGOODHEL =', NGOODHEL
WRITE (6,*) 'NCOMB =', NCOMB
- CALL COUNTERS_SMATRIX1MULTI_STOP( 1 ) ! cudacppHEL=1
+ CALL COUNTERS_STOP_COUNTER( 11 ) ! 11=CudaCpp-Initialise (was CudaCpp-HEL)
ENDIF
- CALL COUNTERS_SMATRIX1MULTI_START( 0, VECSIZE_USED ) ! cudacppMEs=0
+ CALL COUNTERS_START_COUNTER( 19, VECSIZE_USED ) ! CudaCppMEs=19
IF ( .NOT. MULTI_CHANNEL ) THEN
CALL FBRIDGESEQUENCE_NOMULTICHANNEL( FBRIDGE_PBRIDGE, ! multi channel disabled
& P_MULTI, ALL_G, HEL_RAND, COL_RAND, OUT2,
@@ -618,7 +628,7 @@ C ENDIF
& HEL_RAND, COL_RAND, CHANNELS, OUT2,
& SELECTED_HEL2, SELECTED_COL2, .FALSE.) ! do not quit after computing helicities
ENDIF
- CALL COUNTERS_SMATRIX1MULTI_STOP( 0 ) ! cudacppMEs=0
+ CALL COUNTERS_STOP_COUNTER( 19 ) ! CudaCppMEs=19
ENDIF

IF( FBRIDGE_MODE .LT. 0 ) THEN ! (BothQuiet=-1 or BothDebug=-2)
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
index 1124a9164..27a6e4674 100644
index ecd11b239..4650934b2 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
@@ -74,13 +74,77 @@ c common/to_colstats/ncols,ncolflow,ncolalt,ic
@@ -76,16 +76,95 @@ c common/to_colstats/ncols,ncolflow,ncolalt,ic

include 'coupl.inc' ! needs VECSIZE_MEMMAX (defined in vector.inc)
INTEGER VECSIZE_USED
@@ -27,7 +160,19 @@ index 1124a9164..27a6e4674 100644
+ CALL OMPNUMTHREADS_NOT_SET_MEANS_ONE_THREAD()
+#endif
+ CALL COUNTERS_INITIALISE()
+
+c Use null-terminated C-strings in COUNTERS_REGISTER_COUNTER calls (this may not be needed, but it does no harm)
+ CALL COUNTERS_REGISTER_COUNTER( 1, 'Fortran Initialise(I/O)'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 3, 'Fortran PhaseSpaceSampling'//char(0) ) ! uniform [0,1] + vegas to [0,1] + map to momenta
+ CALL COUNTERS_REGISTER_COUNTER( 4, 'Fortran PDFs'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 5, 'Fortran UpdateScaleCouplings'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 6, 'Fortran Reweight'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 7, 'Fortran Unweight(LHE-I/O)'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 8, 'Fortran SamplePutPoint'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 9, 'Fortran MEs'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 11, 'CudaCpp Initialise'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 12, 'CudaCpp Finalise'//char(0) )
+ CALL COUNTERS_REGISTER_COUNTER( 19, 'CudaCpp MEs'//char(0) )
+c CALL COUNTERS_REGISTER_COUNTER( 21, 'TEST SampleGetX'//char(0) )
+#ifdef MG5AMC_MEEXPORTER_CUDACPP
+ fbridge_mode = 1 ! CppOnly=1, default for CUDACPP
+#else
@@ -71,24 +216,39 @@ index 1124a9164..27a6e4674 100644
+ endif
+
+#ifdef MG5AMC_MEEXPORTER_CUDACPP
+ CALL COUNTERS_START_COUNTER( 11, 1 ) ! 11=CudaCpp-Initialise
+ CALL FBRIDGECREATE(FBRIDGE_PBRIDGE, VECSIZE_USED, NEXTERNAL, 4) ! this must be at the beginning as it initialises the CUDA device
+ FBRIDGE_NCBYF1 = 0
+ FBRIDGE_CBYF1SUM = 0
+ FBRIDGE_CBYF1SUM2 = 0
+ FBRIDGE_CBYF1MAX = -1D100
+ FBRIDGE_CBYF1MIN = 1D100
+ CALL COUNTERS_STOP_COUNTER( 11 ) ! 11=CudaCpp-Initialise
+#endif
c
c Read process number
c
@@ -208,8 +272,33 @@ c call sample_result(xsec,xerr)
+ CALL COUNTERS_START_COUNTER( 1, 1 ) ! FortranInitialise=1
call open_file(lun+1, 'dname.mg', fopened)
if (.not.fopened)then
goto 11
@@ -156,6 +235,7 @@ c If CKKW-type matching, read IS Sudakov grid
print *,'Running CKKW as lower mult sample'
endif
endif
+ CALL COUNTERS_STOP_COUNTER( 1 ) ! FortranInitialise=1

c
c Get user input
@@ -216,8 +296,35 @@ c call sample_result(xsec,xerr)
c write(*,*) 'Final xsec: ',xsec

rewind(lun)
-
close(lun)
+
+#ifdef MG5AMC_MEEXPORTER_CUDACPP
+ CALL COUNTERS_START_COUNTER( 12, 1 ) ! 12=CudaCpp-Finalise
+ CALL FBRIDGEDELETE(FBRIDGE_PBRIDGE) ! this must be at the end as it shuts down the CUDA device
+ IF( FBRIDGE_MODE .LE. -1 ) THEN ! (BothQuiet=-1 or BothDebug=-2)
+ WRITE(*,'(a,f10.8,a,e8.2)')
@@ -111,12 +271,13 @@ index 1124a9164..27a6e4674 100644
+ & FBRIDGE_CBYF1SUM / FBRIDGE_NCBYF1, ' +- ',
+ & SQRT( FBRIDGE_CBYF1SUM2 ) / FBRIDGE_NCBYF1 ! ~standard error
+ ENDIF
+ CALL COUNTERS_STOP_COUNTER( 12 ) ! 12=CudaCpp-Finalise
+#endif
+ CALL COUNTERS_FINALISE()
end

c $B$ get_user_params $B$ ! tag for MadWeight
@@ -387,7 +476,7 @@ c
@@ -400,7 +507,7 @@ c
fopened=.false.
tempname=filename
fine=index(tempname,' ')
@@ -126,7 +287,7 @@ index 1124a9164..27a6e4674 100644
open(unit=lun,file=tempname,status='old',ERR=20)
fopened=.true.
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
index 1acba8200..069c74ef4 100644
index bf488e4b0..707ea4032 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
@@ -71,7 +71,10 @@ C
@@ -141,7 +302,7 @@ index 1acba8200..069c74ef4 100644
C
C This is just to temporarily store the reference grid for
C helicity of the DiscreteSampler so as to obtain its number of
@@ -211,6 +214,17 @@ C ----------
@@ -224,6 +227,17 @@ C update.
ENDIF
IF(NTRY(1).EQ.MAXTRIES)THEN
ISHEL=MIN(ISUM_HEL,NGOOD)
@@ -1,3 +1,63 @@
diff --git b/epochX/cudacpp/gg_tt.mad/Source/dsample.f a/epochX/cudacpp/gg_tt.mad/Source/dsample.f
index bcfe1138b..53eec814a 100644
--- b/epochX/cudacpp/gg_tt.mad/Source/dsample.f
+++ a/epochX/cudacpp/gg_tt.mad/Source/dsample.f
@@ -175,7 +175,9 @@ c
if (iter .le. itmax) then
c write(*,*) 'iter/ievent/ivec', iter, ievent, ivec
ievent=ievent+1
+ CALL COUNTERS_START_COUNTER( 3, 1 ) ! FortranPhaseSpaceSampling=3
call x_to_f_arg(ndim,ipole,mincfig,maxcfig,ninvar,wgt,x,p)
+ CALL COUNTERS_STOP_COUNTER( 3 ) ! FortranPhaseSpaceSampling=3
CUTSDONE=.FALSE.
CUTSPASSED=.FALSE.
if (passcuts(p,VECSIZE_USED)) then
@@ -247,6 +249,7 @@ c write(*,*) i, all_wgt(i), fx, all_wgt(i)*fx
do I=1, VECSIZE_USED
all_wgt(i) = all_wgt(i)*all_fx(i)
enddo
+ CALL COUNTERS_START_COUNTER( 8, VECSIZE_USED ) ! FortranSamplePutPoint=8
do i =1, VECSIZE_USED
c if last paremeter is true -> allow grid update so only for a full page
lastbin(:) = all_lastbin(:,i)
@@ -254,6 +257,7 @@ c if last paremeter is true -> allow grid update so only for a full page
c write(*,*) 'put point in sample kevent', kevent, 'allow_update', ivec.eq.VECSIZE_USED
call sample_put_point(all_wgt(i),all_x(1,i),iter,ipole, i.eq.VECSIZE_USED) !Store result
enddo
+ CALL COUNTERS_STOP_COUNTER( 8 ) ! FortranSamplePutPoint=8
if (VECSIZE_USED.ne.1.and.force_reset)then
call reset_cumulative_variable()
force_reset=.false.
@@ -264,7 +268,9 @@ c if (wgt .ne. 0d0) call graph_point(p,wgt) !Update graphs
else
fx =0d0
wgt=0d0
+ CALL COUNTERS_START_COUNTER( 8, 1 ) ! FortranSamplePutPoint=8
call sample_put_point(wgt,x(1),iter,ipole,.true.) !Store result
+ CALL COUNTERS_STOP_COUNTER( 8 ) ! FortranSamplePutPoint=8
endif

endif
@@ -429,7 +435,9 @@ c
call sample_get_config(wgt,iter,ipole)
if (iter .le. itmax) then
ievent=ievent+1
+ CALL COUNTERS_START_COUNTER( 3, 1 ) ! FortranPhaseSpaceSampling=3
call x_to_f_arg(ndim,ipole,mincfig,maxcfig,ninvar,wgt,x,p)
+ CALL COUNTERS_STOP_COUNTER( 3 ) ! FortranPhaseSpaceSampling=3
if (pass_point(p)) then
xzoomfact = 1d0
fx = dsig(p,wgt,0) !Evaluate function
@@ -445,7 +453,9 @@ c
endif

if (nzoom .le. 0) then
+ CALL COUNTERS_START_COUNTER( 8, 1 ) ! FortranSamplePutPoint=8
call sample_put_point(wgt,x(1),iter,ipole,.true.) !Store result
+ CALL COUNTERS_STOP_COUNTER( 8 ) ! FortranSamplePutPoint=8
else
nzoom = nzoom -1
ievent=ievent-1
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile a/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile
index 348c283be..49e6800ff 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile