[amd] regenerate all processes, including OPTFLAGS=-O2 for hipcc instead of -O3 (workaround for gq_ttq crash madgraph5#806)
valassi committed Sep 19, 2024
1 parent 3c2792a commit f91c156
Showing 46 changed files with 188 additions and 166 deletions.
18 changes: 9 additions & 9 deletions epochX/cudacpp/ee_mumu.mad/CODEGEN_mad_ee_mumu_log.txt
@@ -57,7 +57,7 @@ generate e+ e- > mu+ mu-
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005692958831787109
+DEBUG: model prefixing takes 0.0057468414306640625
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -149,7 +149,7 @@ INFO: Checking for minimal orders which gives processes.
INFO: Please specify coupling orders to bypass this step.
INFO: Trying process: e+ e- > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
-1 processes with 2 diagrams generated in 0.005 s
+1 processes with 2 diagrams generated in 0.004 s
Total: 1 processes with 2 diagrams
output madevent_simd ../TMPOUT/CODEGEN_mad_ee_mumu --hel_recycling=False --vector_size=32
Load PLUGIN.CUDACPP_OUTPUT
@@ -182,19 +182,19 @@ INFO: Finding symmetric diagrams for subprocess group epem_mupmum
DEBUG: iconfig_to_diag =  {1: 1, 2: 2} [model_handling.py at line 1547] 
DEBUG: diag_to_iconfig =  {1: 1, 2: 2} [model_handling.py at line 1548] 
Generated helas calls for 1 subprocesses (2 diagrams) in 0.004 s
-Wrote files for 8 helas calls in 0.072 s
+Wrote files for 8 helas calls in 0.071 s
DEBUG: self.vector_size =  32 [export_v4.py at line 7023] 
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates FFV2 routines
ALOHA: aloha creates FFV4 routines
-ALOHA: aloha creates 3 routines in 0.205 s
+ALOHA: aloha creates 3 routines in 0.204 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates FFV2 routines
ALOHA: aloha creates FFV4 routines
ALOHA: aloha creates FFV2_4 routines
-ALOHA: aloha creates 7 routines in 0.260 s
+ALOHA: aloha creates 7 routines in 0.261 s
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV2
@@ -234,10 +234,10 @@ Type "launch" to generate events from this process, or see
Run "open index.html" to see more information about this process.
quit

-real 0m3.845s
-user 0m1.829s
-sys 0m0.251s
-Code generation completed in 4 seconds
+real 0m2.993s
+user 0m1.803s
+sys 0m0.263s
+Code generation completed in 3 seconds
************************************************************
* *
* W E L C O M E to *
1 change: 1 addition & 0 deletions epochX/cudacpp/ee_mumu.mad/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
12 changes: 6 additions & 6 deletions epochX/cudacpp/ee_mumu.sa/CODEGEN_cudacpp_ee_mumu_log.txt
@@ -57,7 +57,7 @@ generate e+ e- > mu+ mu-
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005699634552001953
+DEBUG: model prefixing takes 0.005699872970581055
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -149,7 +149,7 @@ INFO: Checking for minimal orders which gives processes.
INFO: Please specify coupling orders to bypass this step.
INFO: Trying process: e+ e- > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
-1 processes with 2 diagrams generated in 0.004 s
+1 processes with 2 diagrams generated in 0.005 s
Total: 1 processes with 2 diagrams
output standalone_cudacpp ../TMPOUT/CODEGEN_cudacpp_ee_mumu
Load PLUGIN.CUDACPP_OUTPUT
@@ -177,7 +177,7 @@ ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates FFV2 routines
ALOHA: aloha creates FFV4 routines
ALOHA: aloha creates FFV2_4 routines
-ALOHA: aloha creates 4 routines in 0.276 s
+ALOHA: aloha creates 4 routines in 0.273 s
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV2
Expand All @@ -196,7 +196,7 @@ INFO: Created files Parameters_sm.h and Parameters_sm.cc in directory
INFO: /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_ee_mumu/src/. and /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_ee_mumu/src/.
quit

-real 0m0.775s
-user 0m0.619s
-sys 0m0.043s
+real 0m0.661s
+user 0m0.607s
+sys 0m0.048s
Code generation completed in 1 seconds
1 change: 1 addition & 0 deletions epochX/cudacpp/ee_mumu.sa/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
18 changes: 9 additions & 9 deletions epochX/cudacpp/gg_tt.mad/CODEGEN_mad_gg_tt_log.txt
@@ -57,7 +57,7 @@ generate g g > t t~
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.0057220458984375
+DEBUG: model prefixing takes 0.005692720413208008
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -150,7 +150,7 @@ INFO: Please specify coupling orders to bypass this step.
INFO: Trying coupling order WEIGHTED<=2: WEIGTHED IS QCD+2*QED
INFO: Trying process: g g > t t~ WEIGHTED<=2 @1
INFO: Process has 3 diagrams
-1 processes with 3 diagrams generated in 0.008 s
+1 processes with 3 diagrams generated in 0.009 s
Total: 1 processes with 3 diagrams
output madevent_simd ../TMPOUT/CODEGEN_mad_gg_tt --hel_recycling=False --vector_size=32
Load PLUGIN.CUDACPP_OUTPUT
@@ -183,16 +183,16 @@ INFO: Finding symmetric diagrams for subprocess group gg_ttx
DEBUG: iconfig_to_diag =  {1: 1, 2: 2, 3: 3} [model_handling.py at line 1547] 
DEBUG: diag_to_iconfig =  {1: 1, 2: 2, 3: 3} [model_handling.py at line 1548] 
Generated helas calls for 1 subprocesses (3 diagrams) in 0.006 s
-Wrote files for 10 helas calls in 0.074 s
+Wrote files for 10 helas calls in 0.073 s
DEBUG: self.vector_size =  32 [export_v4.py at line 7023] 
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 set of routines with options: P0
ALOHA: aloha creates FFV1 routines
-ALOHA: aloha creates 2 routines in 0.149 s
+ALOHA: aloha creates 2 routines in 0.147 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 set of routines with options: P0
ALOHA: aloha creates FFV1 routines
-ALOHA: aloha creates 4 routines in 0.137 s
+ALOHA: aloha creates 4 routines in 0.135 s
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
@@ -228,10 +228,10 @@ Type "launch" to generate events from this process, or see
Run "open index.html" to see more information about this process.
quit

-real 0m2.049s
-user 0m1.643s
-sys 0m0.271s
-Code generation completed in 2 seconds
+real 0m3.852s
+user 0m1.633s
+sys 0m0.277s
+Code generation completed in 4 seconds
************************************************************
* *
* W E L C O M E to *
1 change: 1 addition & 0 deletions epochX/cudacpp/gg_tt.mad/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
12 changes: 6 additions & 6 deletions epochX/cudacpp/gg_tt.sa/CODEGEN_cudacpp_gg_tt_log.txt
@@ -57,7 +57,7 @@ generate g g > t t~
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005597114562988281
+DEBUG: model prefixing takes 0.005693674087524414
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -176,7 +176,7 @@ Generated helas calls for 1 subprocesses (3 diagrams) in 0.006 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 set of routines with options: P0
ALOHA: aloha creates FFV1 routines
-ALOHA: aloha creates 2 routines in 0.146 s
+ALOHA: aloha creates 2 routines in 0.147 s
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
Expand All @@ -191,7 +191,7 @@ INFO: Created files Parameters_sm.h and Parameters_sm.cc in directory
INFO: /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_gg_tt/src/. and /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_gg_tt/src/.
quit

-real 0m0.897s
-user 0m0.476s
-sys 0m0.056s
-Code generation completed in 1 seconds
+real 0m1.981s
+user 0m0.486s
+sys 0m0.050s
+Code generation completed in 2 seconds
1 change: 1 addition & 0 deletions epochX/cudacpp/gg_tt.sa/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
18 changes: 9 additions & 9 deletions epochX/cudacpp/gg_tt01g.mad/CODEGEN_mad_gg_tt01g_log.txt
@@ -57,7 +57,7 @@ generate g g > t t~
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005770444869995117
+DEBUG: model prefixing takes 0.005487680435180664
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -203,23 +203,23 @@ INFO: Finding symmetric diagrams for subprocess group gg_ttx
DEBUG: len(subproc_diagrams_for_config) =  3 [model_handling.py at line 1523] 
DEBUG: iconfig_to_diag =  {1: 1, 2: 2, 3: 3} [model_handling.py at line 1547] 
DEBUG: diag_to_iconfig =  {1: 1, 2: 2, 3: 3} [model_handling.py at line 1548] 
-Generated helas calls for 2 subprocesses (19 diagrams) in 0.044 s
-Wrote files for 46 helas calls in 0.194 s
+Generated helas calls for 2 subprocesses (19 diagrams) in 0.043 s
+Wrote files for 46 helas calls in 0.190 s
DEBUG: self.vector_size =  32 [export_v4.py at line 7023] 
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 routines
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates VVVV1 set of routines with options: P0
ALOHA: aloha creates VVVV3 set of routines with options: P0
ALOHA: aloha creates VVVV4 set of routines with options: P0
-ALOHA: aloha creates 5 routines in 0.332 s
+ALOHA: aloha creates 5 routines in 0.334 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 routines
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates VVVV1 set of routines with options: P0
ALOHA: aloha creates VVVV3 set of routines with options: P0
ALOHA: aloha creates VVVV4 set of routines with options: P0
-ALOHA: aloha creates 10 routines in 0.319 s
+ALOHA: aloha creates 10 routines in 0.316 s
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
@@ -267,10 +267,10 @@ Type "launch" to generate events from this process, or see
Run "open index.html" to see more information about this process.
quit

-real 0m2.887s
-user 0m2.344s
-sys 0m0.296s
-Code generation completed in 3 seconds
+real 0m2.703s
+user 0m2.310s
+sys 0m0.313s
+Code generation completed in 2 seconds
************************************************************
* *
* W E L C O M E to *
1 change: 1 addition & 0 deletions epochX/cudacpp/gg_tt01g.mad/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
16 changes: 8 additions & 8 deletions epochX/cudacpp/gg_ttg.mad/CODEGEN_mad_gg_ttg_log.txt
@@ -57,7 +57,7 @@ generate g g > t t~ g
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005703926086425781
+DEBUG: model prefixing takes 0.005557537078857422
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -183,22 +183,22 @@ INFO: Finding symmetric diagrams for subprocess group gg_ttxg
DEBUG: iconfig_to_diag =  {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15} [model_handling.py at line 1547] 
DEBUG: diag_to_iconfig =  {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15} [model_handling.py at line 1548] 
Generated helas calls for 1 subprocesses (16 diagrams) in 0.038 s
-Wrote files for 36 helas calls in 0.123 s
+Wrote files for 36 helas calls in 0.130 s
DEBUG: self.vector_size =  32 [export_v4.py at line 7023] 
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 routines
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates VVVV1 set of routines with options: P0
ALOHA: aloha creates VVVV3 set of routines with options: P0
ALOHA: aloha creates VVVV4 set of routines with options: P0
-ALOHA: aloha creates 5 routines in 0.332 s
+ALOHA: aloha creates 5 routines in 0.335 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates VVV1 routines
ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates VVVV1 set of routines with options: P0
ALOHA: aloha creates VVVV3 set of routines with options: P0
ALOHA: aloha creates VVVV4 set of routines with options: P0
-ALOHA: aloha creates 10 routines in 0.319 s
+ALOHA: aloha creates 10 routines in 0.321 s
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
@@ -239,10 +239,10 @@ Type "launch" to generate events from this process, or see
Run "open index.html" to see more information about this process.
quit

-real 0m3.320s
-user 0m2.195s
-sys 0m0.274s
-Code generation completed in 4 seconds
+real 0m2.480s
+user 0m2.184s
+sys 0m0.292s
+Code generation completed in 2 seconds
************************************************************
* *
* W E L C O M E to *
1 change: 1 addition & 0 deletions epochX/cudacpp/gg_ttg.mad/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
10 changes: 5 additions & 5 deletions epochX/cudacpp/gg_ttg.sa/CODEGEN_cudacpp_gg_ttg_log.txt
@@ -57,7 +57,7 @@ generate g g > t t~ g
No model currently active, so we import the Standard Model
INFO: load particles
INFO: load vertices
-DEBUG: model prefixing takes 0.005605220794677734
+DEBUG: model prefixing takes 0.005650758743286133
INFO: Restrict model sm with file models/sm/restrict_default.dat .
DEBUG: Simplifying conditional expressions 
DEBUG: remove interactions: u s w+ at order: QED=1 
@@ -179,7 +179,7 @@ ALOHA: aloha creates FFV1 routines
ALOHA: aloha creates VVVV1 set of routines with options: P0
ALOHA: aloha creates VVVV3 set of routines with options: P0
ALOHA: aloha creates VVVV4 set of routines with options: P0
-ALOHA: aloha creates 5 routines in 0.334 s
+ALOHA: aloha creates 5 routines in 0.331 s
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> VVV1
<class 'aloha.create_aloha.AbstractRoutine'> FFV1
Expand All @@ -199,7 +199,7 @@ INFO: Created files Parameters_sm.h and Parameters_sm.cc in directory
INFO: /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_gg_ttg/src/. and /data/avalassi/GPU2023/madgraph4gpuX/MG5aMC/TMPOUT/CODEGEN_cudacpp_gg_ttg/src/.
quit

-real 0m0.887s
-user 0m0.734s
-sys 0m0.052s
+real 0m0.788s
+user 0m0.733s
+sys 0m0.047s
Code generation completed in 1 seconds
1 change: 1 addition & 0 deletions epochX/cudacpp/gg_ttg.sa/SubProcesses/cudacpp.mk
@@ -236,6 +236,7 @@ else ifeq ($(BACKEND),hip)
GPUSUFFIX = hip

# Optimization flags
+override OPTFLAGS = -O2 # work around "Memory access fault" in gq_ttq for HIP #806: disable hipcc -O3 optimizations
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# DEBUG FLAGS (for #806: see https://hackmd.io/@gmarkoma/lumi_finland)
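
Note on the cudacpp.mk hunks: the same one-line OPTFLAGS addition is repeated in every process directory. A minimal standalone sketch of what that line does in GNU make is given here. It is not the project's makefile: the file name sketch.mk, the `all` target and the value given to XCOMPILERFLAG are placeholder assumptions for illustration only, and only the `override OPTFLAGS` and `GPUFLAGS` lines mirror the hunk.

```make
# sketch.mk: hypothetical standalone file, not part of cudacpp.mk.

# Placeholder value (assumption); the real HIP setting is not shown in this commit.
XCOMPILERFLAG = -Xcompiler

# Without "override", a command-line assignment such as "make OPTFLAGS=-O3"
# would take precedence over an ordinary assignment in the makefile.
# With "override", the makefile value is kept even in that case.
override OPTFLAGS = -O2

# Same construct as in the hunk: prefix every word of OPTFLAGS with XCOMPILERFLAG.
GPUFLAGS = $(foreach opt, $(OPTFLAGS), $(XCOMPILERFLAG) $(opt))

# Print the expanded flags while the makefile is being read.
$(info GPUFLAGS = $(GPUFLAGS))

# Dummy target with an empty recipe so that make has something to build.
all: ;
```

Running `make -f sketch.mk OPTFLAGS=-O3` should still report -O2 in GPUFLAGS, which suggests why the workaround uses `override` rather than a plain assignment: a caller cannot accidentally restore -O3 for the HIP backend from the command line.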