Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: separate cpu/gpu namespaces and fix runtest segmentation fault #728

Closed
wants to merge 22 commits into from

Conversation

valassi
Copy link
Member

@valassi valassi commented Jul 19, 2023

This is a WIP MR that fixes #725, ie the segfault in runtest.

It is a complete fix in ggtt.sa. I keep it in WIP hoever and will probably close it, because I have other MRs that may soon be committed and which this will also modify a lot (@Jooorgen 's HIP MR #718 and my FPE MR # 723, which is actually where I first noticed this issue).

I will most likely merge this into #723. And then discuss with Jorgen which one of HIP or FPE should go in first.

Note: this patch essentially extends the use of the two separate namespaces for CPU and GPU builds (as discussed in #318). While I am still not convinced that it is a good idea to mix both types of builds in a single executable (which is what was causing the segfault #725), at least now this is a bit safer. I am not sure if all issues are fixed, but ceratnly it is much better. A few details

  • Now fptype_sv and cxtype_sv, and also cxtype, are defined in either a CPU or GPU namespace, where they have different definitions
  • In Segfault in testxxx runTest.exe for debug builds (need separate cpu/gpu namespaces) #725 I was reporting not only a segfault, but also a strange mixup where some tests seemed to mix data arrays with different strutures, which is indeed probably a side effect of having different vector sizes mixed
  • I have no idea why we have not observed these errors before, and why they only show up in debug builds...
  • I have also moved to separate GPU/CPU namespaces the Parameters class. For this I had to also build the Parameters class in CUDA in src.
  • The CUDA build in src is done by simply exporting NVCC. This may be a good ides to simplify makefiles eventually in src also for AVX etc.

The build fails because maskand is also defined in testmisc.cc
So far I was never building for CUDA - now the build succeeds, and the test too!
… changes in mgOnGpuConfig.h and mgOnGpuVectors.h
… in src (export NVCC to simplifiy the config; also remove __device__ from operator<<)
…nt namespaces

This CONCLUDES the cleanup of namespaces in ggtt.sa: everything builds and runs ok

NB: in debug mode, now runTest succeeds! As intended, this fixes the segfault in madgraph5#725
@valassi
Copy link
Member Author

valassi commented Jul 19, 2023

This is now merged into #723. I am closing this.

@valassi valassi closed this Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segfault in testxxx runTest.exe for debug builds (need separate cpu/gpu namespaces)
1 participant