Test refactoring of Makefiles #753

hageboeck · 2023-08-17T15:13:12Z

Based on Jorgen's work to separate builds with different vectorisation levels and cuda into dedicated targets and build folders, here we completely change the strategy how makefiles are used.

TODOs:

General description (might need to be updated before merge):

Include rules from the C++ Makefile into the Fortran Makefile
--> Now, only one make is invoked to build both C++ and Fortran executables (or fortran+cuda for the cuda build)
Move common setup stuff into Source/make_opts
--> Setup such as architecture selection and compiler things don't have to be repeated in each Makefile
Allow for overriding flags from outside for every compiler. Internal
flags are now called MG_CXXFLAGS, and CXXFLAGS are appended to the
compiler invocation. Similar flags exist for FC and NVCC.
- This means that you can easily do make CXXFLAGS="-O0 -g" cppavx2 to debug C++ with avx2 (or export that variable for an entire session) without breaking the build system.
- With this, we follow the makefile convention that the flags that the user should set for compiler XX are called XXFLAGS, and these flags follow the internal flags that the Makefile specifies.
Every architecture-specific C++/cuda build now ends up in its own
directory. This includes the library, object files and all executables.
In order to do an architecture-specific build, the Makefile invokes
itself with the appropriate flags. This allows for parallel
architecture-specific builds, and greatly simplifies the Makefiles.
make cppavx2 will e.g. do something like $(make) AVX=avx2 AVXFLAGS=<...> cpp
The cuda build is now completely separated from SIMD-enabled builds
A new default target has been added, "cppnative", which builds
-march=native if the user doesn't specify anything else.
The main targets are now:
- make cuda
- make cpp{sse4,avx2,avx512y,avx512z}
- make cppall for all of the above
- make ALL for all cpp and cuda
- make cppnative
- ... names to be discussed
Clean targets have been simplified.
- "clean" cleans an entire subprocess,
- "cleanavx2" cleans only the architecture-specific build in the subprocess,
- "cleanall" cleans all subprocesses.
- "distclean" cleans everything

oliviermattelaer · 2023-08-30T13:45:19Z

will comment here (another?) issue with the makefile that we might want to handle at the same time.

If I try to compile the pure fortran code (on my mac). First error is:

[P1_uux_ttx]$ make madevent_fortran
/bin/bash: -c: line 0: syntax error near unexpected token `&'
/bin/bash: -c: line 0: `/Library/Developer/CommandLineTools/usr/bin/make  -f cudacpp.mk -pn |& awk '/Building/{print $3}' | sed s/BUILDDIR=//'
makefile:56: *** CUDACPP_BUILDDIR='' should not be empty!.  Stop.
[P1_uux_ttx]$ make madevent_cudacpp
/bin/bash: -c: line 0: syntax error near unexpected token `&'
/bin/bash: -c: line 0: `/Library/Developer/CommandLineTools/usr/bin/make  -f cudacpp.mk -pn |& awk '/Building/{print $3}' | sed s/BUILDDIR=//'
makefile:56: *** CUDACPP_BUILDDIR='' should not be empty!.  Stop.

obviously having CUDACPP_BUILDDIR empty is reasonable in this particular case.

second error (after removing such check by hand) is

ld: unknown option: --no-relax

which is related to avoid 'failed to convert GOTPCREL relocation' error #458

UPDATE

problem was a bash syntax not supported on all terminal/bash version

This reverts commit e38570f.

…ile and added cpp targets

…e in AVX assignment

Now that C++ and cuda build folders are separate, the test cases have to be instantiated on every compiler pass.

clang-format 16 was not accepting the macros at the end of runTest.cc

In order to generate a correct patch for madgraph4gpu, the generated code cannot be patched. Therefore, the pre-patch step is removed here. The desired change (-O3 -ffast-math -fbounds-check) is absorbed into patch.common

…2py patching from generateAndCompare.sh to patchMad.sh Before this commit, codegen was now failing on itscrd90 as it was previously failing on itscrd80 (patchMd.sh fails due to f2py) After this commit, codegen is succeeding again on itscrd90, and now also on itscrd80 (fixing the issue in PR madgraph5#753)

valassi · 2023-12-09T11:27:40Z

Hi @hageboeck thanks for all this, I started having a look.

The first thing that I think needs solving is that the CI fails. In the codegen, many of my new actions fail patchMad.sh. I actually realised that this was also happening across my nodes: on el9 the codegen was ok, but not on el8. So I added a fix in WIP PR #797. However in that PR my CI also fails, so I guess there is still more to be done. I will investigaete more

valassi · 2023-12-09T11:53:18Z

Hi again, I think I fixed the make_opts issue.

The CI however now fails because we need clang format 16 and this is not available in the CI. You have an easy way to do that (without external actions from outside official github, preferably)? Otherwise, do we really need clang format 16? You changed runTest.cc formatting for that, but I was using clang format 13 previously. (An alternative is to not require 16, and disable formatting in runTest.cc)

valassi · 2023-12-09T12:31:53Z

Ok now clang-format-16 is fixed

Next issues

fix 'make avxall' for something else in the CI scripts

(One more issue would be caching clang-16... but in the end I removed the need for clang-16, now 13 is ok, and I protected runTest.cc)

valassi · 2023-12-09T13:23:21Z

Hi @hageboeck I will stop here, sorry.

The main issue is that make -j fails

ccache gfortran -I. -fPIC -w -ffixed-line-length-132 -O3 -ffast-math -fbounds-check -c readgrid.f -o readgrid.o
dsample.f:1090:10:

 1090 |       use DiscreteSampler
      |          1
Fatal Error: Cannot open module file ‘discretesampler.mod’ for reading at (1): No such file or directory

I had a look, you removed those (ugly, I admit) .lib and replaced by explicit discretesampler.mod. Can you please have a look yourslef and try to fix make -j? Thanks Andrea

hageboeck · 2023-12-12T15:35:23Z

The main issue is that make -j fails

ccache gfortran -I. -fPIC -w -ffixed-line-length-132 -O3 -ffast-math -fbounds-check -c readgrid.f -o readgrid.o
dsample.f:1090:10:

 1090 |       use DiscreteSampler
      |          1
Fatal Error: Cannot open module file ‘discretesampler.mod’ for reading at (1): No such file or directory

Hello @valassi,

I had a look. Long story short:
Not my fault, the bug (or shortcoming I guess) is in ccache.

In a bit more detail:
The missing file is the side effect of a Fortran compilation that ccache is unaware of. ccache cannot deal with fortran modules (or Fortran at all; they reverted Fortran support in the linked discussion. No idea if they re-reverted this at some point).

The module that's missing above gets built in the following way:
DiscreteSampler.f --> DiscreteSampler.o + discretesampler.mod --> dsample.o

See this demonstration:

$ make dsample.o
gfortran -I. -fPIC -w -ffixed-line-length-132 -O3 -ffast-math -fbounds-check -c DiscreteSampler.f -o DiscreteSampler.o
gfortran -I. -fPIC -w -ffixed-line-length-132 -O3 -ffast-math -fbounds-check -c dsample.f -o dsample.o

The makefile implements the dependency via object files, and hopes that the modules drop out correctly. I never touched that part:
https://github.com/hageboeck/madgraph4gpu/blob/96f9ba777db50a18ba2c4049b4398e8abd591213/epochX/cudacpp/gg_tt.mad/Source/makefile#L77-L78

It's just that ccache doesn't create modules. Since I don't use ccache for madgraph, I never saw any problems. I can clean and make -j in Source in a loop, and it always builds correctly.

I don't know what the way forward is. You could disable ccache for Fortran, or you could add special make rules for Fortran compilations that also generate modules.

valassi · 2023-12-12T17:39:52Z

Hi @hageboeck thanks. Various points.

It seems I am unable myself to reproduce 'make -j' now, very strange. Maybe I had made a mistake? Or maybe I got into some strange configuration which in any case would have failed in my older code too? Anyway, unless I see this again, I guess there is nothing to fix for that.
I was aware about ccache not supporting fortran. But I do not understand how this could influence 'make -j': if ccache works, it simply means that it produces a .o (or other things, .a, .mod etc) from cache rather than from a real build, and if it does not work, it creates it from a real build. But if the build product is there already, it is the makefile that does not trigger anything, and ccache is not involved at all? (As I said above I seem to be unable to reproduce 'make -j' issues, so I guess this is irrelevant, but I mention this anyway).

Anyway: I agree, for clarity it is probably better to remove ccache from gfortran. I have done this in codegen in my branch.

More importantly now, from other small tests, I do not understand how 'make clean*' is supposed to do. Previously 'make cleanall' was cleaning both Source and local P1 (and mybe mre than one P1), now it does not clean src? And in any case if I do both 'make cleanall' and 'make cleansrc' I am stuck with a build,native empty directory?

Rephrasing:

Clean targets have been simplified.

* "clean" cleans an entire subprocess,

* "cleanavx2" cleans only the architecture-specific build in the subprocess,

* "cleanall" cleans all subprocesses.

* "distclean" cleans everything

What I miss here is (1) something that cleans also src and Source (distclean is too m,uch, it also cleans googletest) and (2) the fact that build.* are not removed for me on linux.

And by the way I think we do not need cleanavx2?

valassi · 2023-12-12T17:49:18Z

I am just comparing what these were doing before

For make clean
Cleans the local P* subdirectory. Does not clean all other P*, does not clean src/lib/Source.
For make cleanall
Cleans the local P* subdirectory. all other P*, and src/lib/Source.
For make distclean
In addition cleans also googletest

I would keep something similar: a simple 'make cleanall' to clean also src/lib/Source.

Or we can discuss it. It is clearly "ugly" that "make" also makes src/Source/lib while "make clean" does not clean that? And we maybe need a way to make all P* subdirectories too...

hageboeck · 2023-12-13T10:10:56Z

I am just comparing what these were doing before

For make clean
Cleans the local P* subdirectory. Does not clean all other P*, does not clean src/lib/Source.

It's doing the same now; just leaving the directories intact. Maybe you want me to remove the directories, too, but I didn't do that because you might have profiling results or other data in those. I just remove build artifacts, so it doesn't destroy something that you would like to keep. Can be discussed.

For make cleanall
Cleans the local P* subdirectory. all other P*, and src/lib/Source.

It does the same except for cleaning src/Source. For those, there is cleansrc, which I found somewhere in Jørgen's work, so I left that intact.
This could be included in the cleanall target with

- cleanall:
+ cleanall: cleansrc

Say it and I add this.

For make distclean
In addition cleans also googletest

I would keep something similar: a simple 'make cleanall' to clean also src/lib/Source.

Or we can discuss it. It is clearly "ugly" that "make" also makes src/Source/lib while "make clean" does not clean that? And we maybe need a way to make all P* subdirectories too...

For that, there would be the meta-Makefile in <processName>.mad. [Edit: It's not in the PR, but Stefan and me mentioned that. Let me add it in the next minutes.] I think it's more logical to keep the makefiles in subdirectories "local" in the sense that they deal only with the local stuff. (Of course we break that rule by building Source/src/lib, but it wouldn't compile otherwise).
But to build everything everything, I guess it's more logical to do:

cd gg_tt.mad
make -j cppavx2
make clean

and this would build and clean all Source/src/lib and all P*. What do you think?

Zenny's and Filips work showed that the colour matrix can run in single precision without impact on the results. Make this the default in the makefiles now.

- clean will now also remove build directories when they are empty. - cleanall will also invoke cleansrc to clean the source directory.

hageboeck · 2023-12-13T13:47:39Z

Hello @valassi,

just pushed the following amendments:

clean wipes the build directories unless they contain some user-created files.
cleanall now also cleans Source/src/lib
And make mixed-precision mode the default since the colour matrix is sufficiently accurate as we discussed in person a few times.

valassi · 2023-12-13T19:35:00Z

Thanks Stefan I will have a look!

Listen at least some of the patches I have in #797 are also needed, how do you prefer that we proceed? Maybe I make a subset and make them as a PR against your branch? Or I tell you which ones to cherry-pick?

Thanks
Andrea

…2py patching from generateAndCompare.sh to patchMad.sh (This is an old commit from 9 Dec but the description is still valid on 14 Dec) Before this commit, codegen was now failing on itscrd90 as it was previously failing on itscrd80 (patchMd.sh fails due to f2py) After this commit, codegen is succeeding again on itscrd90, and now also on itscrd80 (fixing the issue in PR madgraph5#753)

valassi · 2023-12-14T08:29:01Z

@hageboeck thanks I am having a look.

A couple of points

I am now recreating/retesting the patches I had made in WIP: refactor makefiles (additions to Stephan's and Joorgen's PRs) #797 to fix CODEGEN in the CI and similar issues. Can you please wait and then apply my patches on top of yours before doing other changes?
In particular, please do not do forced pushes now on what you have so far.
I checked 'make cleanall': it gives an error (had you tested that? maybe it succeeds on mac but must be tested on linux?). Anyway, I will also fix that.

ls: cannot access 'build.*': No such file or directory
rm: cannot remove 'build.*': No such file or directory
make[1]: *** [makefile:224: clean] Error 1
make[1]: Leaving directory '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx'
make: *** [makefile:227: cleanall] Error 2

More later!
Thanks

hageboeck · 2023-12-14T08:49:27Z

I am now recreating/retesting the patches I had made in WIP: refactor makefiles (additions to Stephan's and Joorgen's PRs) #797 to fix CODEGEN in the CI and similar issues. Can you please wait and then apply my patches on top of yours before doing other changes?

In particular, please do not do forced pushes now on what you have so far.

I checked 'make cleanall': it gives an error (had you tested that? maybe it succeeds on mac but must be tested on linux?). Anyway, I will also fix that.

Interesting. Of course I will wait, because I know what conflict resolution feels like in this project.

Also interesting that you ask whether I tested that. Of course I tested! Both on mac and linux. I do nothing without testing it, but of course I might overlook specific workflows. In this case, this seems to be cleaning without having a build directory. The fix is easy, but I see that you are already doing that, so will wait now as promised.

valassi · 2023-12-14T08:52:41Z

Also interesting that you ask whether I tested that. Of course I tested! Both on mac and linux. I do nothing without testing it, but of course I might overlook specific workflows. In this case, this seems to be cleaning without having a build directory. The fix is easy, but I see that you are already doing that, so will wait now as promised.

Thanks Stephan! Yes I imagine you tested... which is even more surprising. Maybe you tested only with a clean dir? The failure is only when build.* have already been deleted. This is because for d in build.*; do if there is NO directory called build.* then expands to literal build.*, which is not a directory. I just learnt this has to do with bash "nullglob" and "failglob" settings etc

hageboeck self-assigned this Aug 17, 2023

hageboeck force-pushed the refactorMakefile branch from 23865af to 366ddfc Compare August 17, 2023 15:28

Jooorgen added 27 commits September 27, 2023 17:02

Moved compilation of CUDA to seperate cuda target

792080e

Fixed indentation in the makefile

c945deb

Removed all unneccessary NVCC ifneq and redid indentation

36a5fa7

Convert all spaces to tabs in cudacpp.mk file

0b95fb7

Revert "Convert all spaces to tabs in cudacpp.mk file"

360d908

This reverts commit e38570f.

Changed line to be interpreted in bash into spaces

575840c

Changed indentation again

19911fb

Fixed indentation in start of CUDA compilation

f4366a9

Fixed indentation in fcheck compilation

b819498

Changed indentation of ifndef in CUDA compilation

9082da0

Changed indentation back to tabs again for some lines

bfd4301

Changed indentation again to fix endif error

529318c

Revert back cudacpp.mk to original state

fc549ca

Added cuda target

bd0df54

Added usage target if user doesent speficy target when invoking makef…

df98ffa

…ile and added cpp targets

Added .DEFAULT_GOAL so usage target will be run correctly

f822e93

Added targets to cpp build

63bb3b5

Added correct cpp targets

b5758b0

Added $(MAKECMDGOALS) ifeq to CUDA compiler configuration

eb3a706

Removed automatic configuration of AVX level

b23ada0

Fixed correct AVX level depending on C++ compiler

683e641

Reworked configuration of setting AVX level

fca0b73

Fixed missing endif and removed AVX configuration in cuda compilation

489c453

Fixed AVX not being properly set

01b7da3

Fixed cuda compilation in cudacpp_src.mk and fixed trailing whitespac…

e734f49

…e in AVX assignment

Added else case in AVX configuration

35d6aee

Port changes to BUILDDIRTAG to cudacpp_src.mk

b611e9e

hageboeck added 3 commits November 30, 2023 16:25

[CODEGEN] Instantiate madgraph test cases also for CUDA builds.

eec9d56

Now that C++ and cuda build folders are separate, the test cases have to be instantiated on every compiler pass.

[CODEGEN] Apply clang-format to runTest.cc

7452bbe

clang-format 16 was not accepting the macros at the end of runTest.cc

[CODEGEN] Remove level 0 pre-patches from generateAndCompare

99c2758

In order to generate a correct patch for madgraph4gpu, the generated code cannot be patched. Therefore, the pre-patch step is removed here. The desired change (-O3 -ffast-math -fbounds-check) is absorbed into patch.common

hageboeck force-pushed the refactorMakefile branch from 08ac94d to 96f9ba7 Compare November 30, 2023 15:28

valassi mentioned this pull request Dec 9, 2023

WIP: refactor makefiles (additions to Stephan's and Joorgen's PRs) #797

Closed

hageboeck added 5 commits December 13, 2023 14:36

[CODEGEN] Make mixed-precision computations the default.

c85de5d

Zenny's and Filips work showed that the colour matrix can run in single precision without impact on the results. Make this the default in the makefiles now.

[CODEGEN] Refine clean targets.

23d6470

- clean will now also remove build directories when they are empty. - cleanall will also invoke cleansrc to clean the source directory.

Regenerate all *.mad processes.

1904d74

[CI] Fix Mac build targets for CI runs.

5773856

[CI] Update debug build configuration.

629d5aa

hageboeck force-pushed the refactorMakefile branch from 96f9ba7 to 629d5aa Compare December 13, 2023 13:44

valassi mentioned this pull request Dec 15, 2023

Jorgen's makefiles with separate builds for CUDA, HIP and C++ #798

Merged

hageboeck closed this Dec 15, 2023

oliviermattelaer reopened this Dec 15, 2023

valassi mentioned this pull request Jul 10, 2024

Enable -Wall warnings in cuda builds (and fix all warnings) #902

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test refactoring of Makefiles #753

Test refactoring of Makefiles #753

hageboeck commented Aug 17, 2023 •

edited

Loading

oliviermattelaer commented Aug 30, 2023 •

edited

Loading

valassi commented Dec 9, 2023

valassi commented Dec 9, 2023

valassi commented Dec 9, 2023 •

edited

Loading

valassi commented Dec 9, 2023 •

edited

Loading

hageboeck commented Dec 12, 2023

valassi commented Dec 12, 2023 •

edited

Loading

valassi commented Dec 12, 2023

hageboeck commented Dec 13, 2023 •

edited

Loading

hageboeck commented Dec 13, 2023 •

edited

Loading

valassi commented Dec 13, 2023

valassi commented Dec 14, 2023

hageboeck commented Dec 14, 2023

valassi commented Dec 14, 2023 •

edited

Loading

Test refactoring of Makefiles #753

Are you sure you want to change the base?

Test refactoring of Makefiles #753

Conversation

hageboeck commented Aug 17, 2023 • edited Loading

TODOs:

General description (might need to be updated before merge):

oliviermattelaer commented Aug 30, 2023 • edited Loading

UPDATE

valassi commented Dec 9, 2023

valassi commented Dec 9, 2023

valassi commented Dec 9, 2023 • edited Loading

valassi commented Dec 9, 2023 • edited Loading

hageboeck commented Dec 12, 2023

valassi commented Dec 12, 2023 • edited Loading

valassi commented Dec 12, 2023

hageboeck commented Dec 13, 2023 • edited Loading

hageboeck commented Dec 13, 2023 • edited Loading

valassi commented Dec 13, 2023

valassi commented Dec 14, 2023

hageboeck commented Dec 14, 2023

valassi commented Dec 14, 2023 • edited Loading

hageboeck commented Aug 17, 2023 •

edited

Loading

oliviermattelaer commented Aug 30, 2023 •

edited

Loading

valassi commented Dec 9, 2023 •

edited

Loading

valassi commented Dec 9, 2023 •

edited

Loading

valassi commented Dec 12, 2023 •

edited

Loading

hageboeck commented Dec 13, 2023 •

edited

Loading

hageboeck commented Dec 13, 2023 •

edited

Loading

valassi commented Dec 14, 2023 •

edited

Loading