WIP add pp_tt to repo (plus obsolete fixes for bug 872 via reset_cumulative_variable, now fixed by Olivier via a single helicity filter) #935
base: master
NB: there are big differences between these two directories diff pp_tt012j.mad/SubProcesses/P2_gu_ttxgu/ gu_ttgu.mad/SubProcesses/P1_gu_ttxgu/
…s not show up here?...) ./tmad/teeMadX.sh -guttgu -makeclean +10x
…5#872 in pp_tt012j (with q = u c d s u~ c~ d~ s~)
NB1: there are big differences between these two directories
diff pp_tt012j.mad/SubProcesses/P2_gu_ttxgu/ gq_ttgq.mad/SubProcesses/P1_gu_ttxgu/
NB2: there are some differences between these two directories, but not that many (the main difference is MAXPROC=4 vs MAXPROC=1)
diff gu_ttgu.mad/SubProcesses/P1_gu_ttxgu/ gq_ttgq.mad/SubProcesses/P1_gu_ttxgu/
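The directory comparisons quoted in these commit messages are done with plain `diff`. As a rough sketch of the same idea in Python, the standard-library `filecmp` module can summarize which files exist on only one side and which differ in content. The file names `example.inc`, `same.f` and `only_in_a.f` below are hypothetical placeholders and are not taken from the repository; only the top level of each directory is compared.

```python
import filecmp
import os
import tempfile


def summarize_diff(dir_a, dir_b):
    """Summarize top-level differences between two directories."""
    cmp = filecmp.dircmp(dir_a, dir_b)
    # Compare the common files byte by byte (shallow=False) rather than by os.stat signature.
    _, mismatch, errors = filecmp.cmpfiles(dir_a, dir_b, cmp.common_files, shallow=False)
    return {
        "only_a": sorted(cmp.left_only),
        "only_b": sorted(cmp.right_only),
        "differ": sorted(mismatch + errors),
    }


def demo():
    """Build two small throwaway directories (hypothetical contents) and compare them."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for directory, maxproc in ((a, 4), (b, 1)):
            # Hypothetical file that differs only in the MAXPROC value.
            with open(os.path.join(directory, "example.inc"), "w") as f:
                f.write(f"MAXPROC={maxproc}\n")
            # Hypothetical file that is identical on both sides.
            with open(os.path.join(directory, "same.f"), "w") as f:
                f.write("      END\n")
        # Hypothetical file that exists on one side only.
        with open(os.path.join(a, "only_in_a.f"), "w") as f:
            f.write("      END\n")
        return summarize_diff(a, b)


result = demo()
print(result)
```

For the real directories one would call `summarize_diff` on the two `SubProcesses/P*_gu_ttxgu/` paths; a recursive comparison would need to walk `dircmp.subdirs` as well.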
…does not show up here?...) ./tmad/teeMadX.sh -gqttgq -makeclean +10x
…aph5#872 in pp_tt012j, remove antiquarks (q = u c d s)
…rks - P1_gux_ttxgux disappears, but P1_gu_ttxgu is identical
…madgraph5#872 in multi-leg pp_tt012j
NB: the two following directories are essentially identical diff pp_tt012j.mad/SubProcesses/P2_gu_ttxgu/ pp_ttjj.mad/SubProcesses/P1_gu_ttxgu/
…l (bug madgraph5#872)
./tmad/teeMadX.sh -pptt012j -makeclean +10x
./tmad/teeMadX.sh -ppttjj -makeclean +10x
HOWEVER
- strangely enough, the comparisons with 8192 events give the same cross sections (?the CI fails with 32?!)
- the cross sections are different in the two cases pp_tt012j and pp_ttjj for the same subprocess
- in particular the numbers of events processed seem different in fortran/cpp and in 012j/jj
Look at this in particular
diff tmad/logs_pptt012j_mad/log_pptt012j_mad_d_inl0_hrd0.txt tmad/logs_ppttjj_mad/log_ppttjj_mad_d_inl0_hrd0.txt
…rorprocs.inc, try to change MIRRORPROCS from true to false to investigate madgraph5#872 (See also Roiser's comments madgraph5#872 (comment))
…MIRRORPROCS=FALSE: they now both succeed (work around madgraph5#872)
./tmad/teeMadX.sh -pptt012j -makeclean +10x
./tmad/teeMadX.sh -ppttjj -makeclean +10x
Note in particular that the numbers of events processed are now the same in Fortran and C++
…(undo PR madgraph5#754) to fix madgraph5#872
This resets MIRRORPROCS to FALSE (instead of TRUE) in mirrorprocs.inc files, e.g. in pp_ttjj
This fixes the fortran/cudacpp xsec mismatch madgraph5#872 in pp_ttjj
Note in particular that the numbers of events processed are now the same in Fortran and C++, while a higher number of events was processed in Fortran without this fix
Revert "avoid that mirroring is reset by the plugin"
This reverts commit 2556cdd (from PR madgraph5#754)
Fix conflicts: epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py
…py to disable mirror processing again (undo PR madgraph5#754) to fix madgraph5#872
This does set MIRRORPROCS to false. But it ALSO sets nprocesses=1 instead of nprocesses=2.
…RORPROCS=FALSE and nprocesses=1: it succeeds again (work around madgraph5#872)
The question is, is nprocesses relevant at all in the cudacpp calculations? (PS the answer is, no it is not: it is only used for sanity checks... and I will add back the assert expecting nprocesses=1)
./tmad/teeMadX.sh -pptt012j -makeclean +10x
./tmad/teeMadX.sh -ppttjj -makeclean +10x
…ting nprocesses == 1 (part of the fix for madgraph5#872 in PR madgraph5#935; undo changes from PR madgraph5#754 and madgraph5#764)
NB: nprocesses is unused in cudacpp, because cudacpp is unable to handle mirror processes! This means that the results of madevent with cudacpp MEs are the same whether nprocesses is 1 or 2. However, nprocesses=2 signals that Fortran is using mirror processes. If mirror processes are used in Fortran MEs but not in cudacpp MEs, this leads to a xsec mismatch madgraph5#872.
Implementing mirror processes in cudacpp would be extremely complicated. This is an assumption that was made years ago. To be cross-checked with Olivier, but I think that the code generated with cudacpp without mirror processes also ensures that the mirror processes are _separately_ generated as separate subprocesses also in Fortran, therefore the whole calculation is internally consistent if mirrorprocesses is false in both Fortran and cudacpp.
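The reasoning in the commit message above can be illustrated with a toy model. This is only a sketch under loud assumptions: `toy_weight` is a made-up asymmetric function standing in for PDFs times matrix element, the sample points are arbitrary, and none of the function names exist in madgraph4gpu. It shows that a side that adds the mirrored initial state (nprocesses=2) disagrees with a side that does not (nprocesses=1), and that a guard on nprocesses == 1, in the spirit of the static_assert mentioned above, rejects the inconsistent configuration.

```python
def toy_weight(x1, x2):
    """Hypothetical asymmetric weight (stand-in for PDFs times |ME|^2)."""
    return x1 * (1.0 - x2)


def toy_xsec(points, nprocesses):
    """Average weight over the sample; nprocesses=2 also adds the mirrored state (x1 <-> x2)."""
    if nprocesses not in (1, 2):
        raise ValueError("nprocesses must be 1 or 2")
    total = 0.0
    for x1, x2 in points:
        total += toy_weight(x1, x2)
        if nprocesses == 2:
            total += toy_weight(x2, x1)  # mirror process contribution
    return total / len(points)


def cudacpp_like_xsec(points, nprocesses):
    """Toy analogue of a side that cannot handle mirror processes.

    The assertion mimics the sanity check: anything but nprocesses == 1 is refused.
    """
    assert nprocesses == 1, "mirror processes are not supported in this sketch"
    return toy_xsec(points, 1)


# Arbitrary sample points, chosen only for illustration.
points = [(0.1, 0.2), (0.3, 0.7), (0.5, 0.4)]

fortran_with_mirror = toy_xsec(points, 2)
fortran_without_mirror = toy_xsec(points, 1)
cudacpp = cudacpp_like_xsec(points, 1)

print("mismatch with mirroring on one side only:", fortran_with_mirror != cudacpp)
print("consistent with mirroring off on both sides:", fortran_without_mirror == cudacpp)
```

In this toy picture the two sides agree only when mirroring is switched off on both, which matches the consistency argument made in the commit message.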
…_assert for nprocesses=1
…rror processing and to add a static_assert for nprocesses=1 (madgraph5#872)
…RORPROCS=FALSE and assert nprocesses=1: it still succeeds (fix madgraph5#872)
./tmad/teeMadX.sh -pptt012j -makeclean +10x
./tmad/teeMadX.sh -ppttjj -makeclean +10x
STARTED AT Wed Jul 31 11:46:06 AM CEST 2024
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Wed Jul 31 12:07:50 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Wed Jul 31 12:15:50 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Wed Jul 31 12:24:10 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Wed Jul 31 12:26:53 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Wed Jul 31 12:29:34 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common
ENDED(6) AT Wed Jul 31 12:32:20 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean
ENDED(7) AT Wed Jul 31 12:41:32 PM CEST 2024 [Status=0]
… helicities madgraph5#872) - all as expected (failures only in heft madgraph5#833)
STARTED AT Wed Jul 31 12:41:32 PM CEST 2024
(SM tests)
ENDED(1) AT Wed Jul 31 05:09:52 PM CEST 2024 [Status=0]
(BSM tests)
ENDED(1) AT Wed Jul 31 05:21:16 PM CEST 2024 [Status=0]
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_pptt_mad/log_pptt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_pptt_mad/log_pptt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_pptt_mad/log_pptt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt
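The listing above pairs an integer count with each log file, and nearly every entry reads 24. What the count measures is not stated here, so the sketch below only assumes that a log whose count differs from the most common value deserves a closer look. The function names are hypothetical; the three sample entries are copied from the listing.

```python
import re
from collections import Counter


def parse_counts(text):
    """Extract (count, path) pairs from 'COUNT PATH.txt' entries, whether flattened or one per line."""
    return [(int(n), path) for n, path in re.findall(r"(\d+)\s+(\S+\.txt)", text)]


def find_outliers(entries):
    """Return the entries whose count differs from the most common count."""
    if not entries:
        return []
    expected, _ = Counter(n for n, _ in entries).most_common(1)[0]
    return [(n, path) for n, path in entries if n != expected]


# Three entries copied from the listing above.
listing = """
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
"""

entries = parse_counts(listing)
outliers = find_outliers(entries)
for count, path in outliers:
    print(count, path)
```

On the sample this flags only the `heftggbb` single-precision (`_f_`) log, consistent with the commit note that failures appear only in heft.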
I have rerun all my manual tests too, I confirm this as good to go, unless we decide on a different strategy
Ok, so #941 concluded
So I'm working on a PR for MG5aMC: mg5amcnlo/mg5amcnlo#124. Let's discuss next week.
…5#949 fixing madgraph5#947) into pptt Fix conflicts in epochX/cudacpp/pp_tt.mad/CODEGEN_mad_pp_tt_log.txt (git checkout pptt pp_tt.mad/CODEGEN_mad_pp_tt_log.txt)
Note: there is no change here, which is strange because pp_tt.mad/bin/internal/madevent_interface.py should have changed?
Strangely, git blame shows that this is considered as a move of gg_tt01g.mad...
git blame pp_tt.mad/bin/internal/madevent_interface.py | grep 947
8a8ee1a epochX/cudacpp/gg_tt01g.mad/bin/internal/madevent_interface.py (Andrea Valassi 2024-08-02 17:28:18 +0200 3970) # avoid case where some file starts with G (madgraph5#947)
Thanks Olivier. On my side I updated this PR with #949 upstream/master. My proposal is the following:
But I would not wait to have all of the above done before fixing these pptt012j tests... I want to see the cross sections identical, and be able to debug CMS without having that issue, too. Anyway, let's discuss next week! Have a good weekend :-) Andrea
So my counter-proposal is to move directly to this PR: #955, and then move on to #950.
…ier merging git checkout upstream/master $(git ls-tree --name-only upstream/master tmad/logs* tput/logs*)
…r merging git checkout upstream/master $(git ls-tree --name-only upstream/master */CODEGEN*txt)
…g_tt.mad, to ease merging and conflict resolution (From the cudacpp directory) git checkout upstream/master $(git ls-tree --name-only upstream/master *.mad *.sa | grep -v ^gg_tt.mad)
…ase merging (prevent git from thinking that pp_tt is its renaming...)
…mad to the repo ./CODEGEN/generateAndCompare.sh -q gg_tt01g --mad
…ad) from upstream/master, except gg_tt.mad, to ease merging and conflict resolution (From the cudacpp directory) git checkout upstream/master $(git ls-tree --name-only upstream/master *.mad *.sa | grep -v ^gg_tt.mad)
… mac madgraph5#974, nvcc madgraph5#966) into pptt Fix conflicts: epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1 (take HEAD version, must recompute) epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f (fix manually)
I will look at 955 from Olivier, but in the meantime I resynced this to the latest upstream/master
…ith upstream/master
…/master
The only files that still need to be patched are
- 3 in patch.common: Source/makefile, Source/genps.inc, SubProcesses/makefile
- 3 in patch.P1: auto_dsig1.f, driver.f, matrix1.f
./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/makefile gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad
I renamed this and marked it as WIP. The initial focus of this PR (fixing 872 via reset_cumulative_variable) is obsolete. The issue has been fixed by Olivier in his #955, which will be included in the upcoming #992. I keep this open just in case I want to add pp_tt (temporarily?) to the repo. Low priority, will have a look after #992.
This is a WIP PR to investigate the pp_tt012j bug 872