Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include mg5amcnlo in madgraph4gpu as a git submodule #752

Merged
merged 19 commits into from
Aug 16, 2023

Conversation

valassi
Copy link
Member

@valassi valassi commented Aug 16, 2023

Hi @oliviermattelaer @hageboeck @roiser @zeniheisser @Jooorgen
I have completed the inclusion of mg5amcnlo in madgraph4gpu as a git submodule.

This MR is ready to be merged... but please let me know if you have comments or would have liked some things differently.

This completes our "step 1" agreed last week

  • Step 1, mg5amcnlo is a submodule of madgraph4gpu, while cudacpp is still a directory in the madgraph4gpu repo
  • Step 2, cudacpp is a separate repo, included as a submodule in mg5amcnlo, which is itself a submodule of madgraph4gpu

For info, what I personally did was the following

  • by default, mg5amcnlo will be in detached HEAD state with a given commit (no branch)
  • I went and added a remote for my fork, and then checked out the valassi/gpucpp branch from my fork as a local "valassi_gpucpp" branch which contains the commit. Then updating the module must be done with the merge option, git submodule update --remote --merge.
    See section "Working on a Submodule" in https://git-scm.com/book/en/v2/Git-Tools-Submodules

I am also progressing on step2, i.e. the move of the codegen plugin to a separate repo (the only tricky bit is I am keeping the commit history for that and will try to change the links to existing github issues to point to madgraph4gpu - and then making a few tests to understand how to continue porting codegen changes in existing MRs like HIP). It should be ready in one day or two I think.

Thanks
Andrea

PS I take it back slightly (see https://github.com/madgraph5/madgraph4gpu/pull/752/files)

  • the commit is certainly fixed in the submodule at "Submodule mg5amcnlo added at 941832"
  • however the .gitmodule file also contains gpucpp
    So I guess that a new clone will checkout gpucpp and use the specific commit there? I will check

cd <top>
git submodule set-branch --branch gpucpp MG5AMC/mg5amcnlo

(NB: the module name is MG5AMC/mg5amcnlo, you should not run this command with argument mg5amcnlo from directory MG5AMC)
(NB: 'git config -l' shows no difference)
git submodule update --remote
Submodule path 'MG5aMC/mg5amcnlo': checked out '941832b9b03213da8ab994427add396673820799'
…er files showing directory names

In proc_card_mg5.dat
-output madevent CODEGEN_mad_gg_tt ...
+output madevent ../TMPOUT/CODEGEN_mad_gg_tt ...

In me5_configuration.txt
-#mg5_path = /data/avalassi/GPU2023/MG5aMC/ghav-mg5amcnlo
+#mg5_path = /data/avalassi/GPU2023/madgraph4gpuMOD/MG5aMC/mg5amcnlo

In py3_model.pkl
(binary files differ)

NOTE: I also added gg_tt.mad/bin/internal/plugin_run_card which seems to be a new file?
This is probably something added by OLivier after last week's discussions, it contains
{'name':'cudacpp_backend', 'value':'CPP', 'include':False, 'hidden':False}
… by git submodules

Note: they were pointing to
branch.GIT = gpucpp
commit.GIT = 941832b9b

Note: 'git show b145ba4' gives
Submodule MG5aMC/mg5amcnlo 1a13d8f53..941832b9b:
That is to say, the submodule points to the same commit (by default in detached HEAD state, unless users checkout a specific branch)
@valassi
Copy link
Member Author

valassi commented Aug 16, 2023

I have discussed this briefly with @roiser and @hageboeck in person.

There should not be any major issue. All CI tests passed, I will self merge.

Thanks again to @hageboeck for all the ideas on which this is based!

@valassi valassi merged commit bd255c0 into madgraph5:master Aug 16, 2023
21 checks passed
@roiser
Copy link
Member

roiser commented Aug 16, 2023

There is one situation where we would need to be careful, i.e. when switching the submodule to a another fork. E.g. I have checked out the MG5aMC with a branch of my (origin) fork

roiser@itscrd-a100:mg5amcnlo (roiser_gpucpp)$ pwd
/afs/cern.ch/work/r/roiser/sw/madgraph4gpu/MG5aMC/mg5amcnlo
roiser@itscrd-a100:mg5amcnlo (roiser_gpucpp)$ git remote -vv
origin	https://github.com/roiser/mg5amcnlo (fetch)
origin	https://github.com/roiser/mg5amcnlo (push)
upstream	https://github.com/mg5amcnlo/mg5amcnlo (fetch)
upstream	https://github.com/mg5amcnlo/mg5amcnlo (push)
roiser@itscrd-a100:mg5amcnlo (roiser_gpucpp)$ git branch -vv
  3.x           1a13d8f53 [upstream/3.x] merge with 2.9.16
  gpucpp        cb4771efc [upstream/gpucpp] In .gitignore, replace PLUGIN/CUDACPP_SA_OUTPUT by PLUGIN/CUDACPP*OUTPUT (the plugin will be installed as CUDACPP_OUTPUT) in the future
* roiser_gpucpp 74f91f0af [origin/gpucpp] force finalize for second exporter
roiser@itscrd-a100:mg5amcnlo (roiser_gpucpp)$ 

this will create a diff in the "outer" repository for the submodule link

roiser@itscrd-a100:MG5aMC (master)$ pwd
/afs/cern.ch/work/r/roiser/sw/madgraph4gpu/MG5aMC
roiser@itscrd-a100:MG5aMC (master)$ git diff
diff --git a/MG5aMC/mg5amcnlo b/MG5aMC/mg5amcnlo
index cb4771efc..74f91f0af 160000
--- a/MG5aMC/mg5amcnlo
+++ b/MG5aMC/mg5amcnlo
@@ -1 +1 @@
-Subproject commit cb4771efc507953c5ae03320ef5ee5496fabbff3
+Subproject commit 74f91f0affaceeeeb19e98dc862f379eebfe8598
roiser@itscrd-a100:MG5aMC (master)$ 

--> we should be careful to never commit/push this link e.g. via a PR, otherwise this wrong link will end up in the upstream repository pointing to (in this case) my fork for the submodule

@valassi
Copy link
Member Author

valassi commented Aug 16, 2023

Hi Stefan thanks. Ah ok now I understand what you mean. We had a misunderstanding between two cases.

  1. You are right, if you checkout a different branch, and this branch has a different commit than gpucpp, then you risk committing the wrong commit hash. But if your branch has excatly the same content as gpucpp, then the commit is the same, and this should not change, can you check?

  2. What I meant was excatly, that your branch has the same content as the commit hash. In that case, what I had understood from you was that we would risk changing the branch name in .gitmodules (not the commit hash in the submodule). In my understanding, this should NOT happen. Can you check?

valassi added a commit to valassi/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
@valassi
Copy link
Member Author

valassi commented Aug 16, 2023

This is logically part oif #661, even if it is a slightly different issue. Step 2 will be a proper part of issue #661.

@roiser
Copy link
Member

roiser commented Aug 17, 2023

Hi Stefan thanks. Ah ok now I understand what you mean. We had a misunderstanding between two cases.

1. You are right, if you checkout a different branch, and this branch has a different commit than gpucpp, then you risk committing the wrong commit hash. But if your branch has excatly the same content as gpucpp, then the commit is the same, and this should not change, can you check?

yes you are right, if the two branches "upstream" and "origin" are at the same stage the commit hash is also the same, so there won't be a problem

The only possible problem left I see is if they diverge then we should understand if one does two PRs one for the submodule and one for the commit hash outside or we "adjust" the commit hash in a different way once the submodule PR has been merged.

2. What I meant was excatly, that your branch has the same content as the commit hash. In that case, what I had understood from you was that we would risk changing the branch name in .gitmodules (not the commit hash in the submodule). In my understanding, this should NOT happen. Can you check?

no that's ok, no problem here.
thanks Stefan

@valassi
Copy link
Member Author

valassi commented Aug 17, 2023

Hi, good, thanks for checking.

The only possible problem left I see is if they diverge then we should understand if one does two PRs one for the submodule and one for the commit hash outside or we "adjust" the commit hash in a different way once the submodule PR has been merged.

Hm I think that unfortunately we will need always two MRs anyway, one on madgraph4gpu and one on gpucpp. These are two different repos, I do not think we can combine change sto two repos in a single MR?

If indeed we need two MRs. then I guess it is better to first merge mg5amcnlo and only later merge madgraph4gpu, as the latter contains the hash to the former, not viceversa. This is almost exactly what I had with my commit.GIT anyway. Sometimes I was making changes in mg5amcnlo gpucpp in my fork and create a MR (so the hash would be in my fork anyway) and change madgraph4gpu even if that commit of gpucpp was not yet merged. But we should try to avoid it.

So I guess, ideally, first push/merge on mg5amcnlo gpucpp so that the commit hash is there, and then push/merge the reference to that hash in madgraph4gpu. Then sometimes there can be exceptions/deviations...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants