
Parallelization for PET reconstruction of multiple frames leads to error #1283

fdellekart opened this issue Aug 1, 2024 · 4 comments
fdellekart commented Aug 1, 2024

Hello,

I am currently setting up a reconstruction for an fPET application, where I need to reconstruct a series of individual frames from listmode data of a Biograph mMR system.

I am using a Docker container built on top of the latest version of ghcr.io/synerbi/sirf, and for a single frame the reconstruction works just fine. To speed up the processing and utilize the machine properly, I tried to parallelize the reconstruction of individual frames using Python's multiprocessing module. You can find my script here.
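Roughly, the pattern looks like this (a minimal sketch; reconstruct_frame is a hypothetical placeholder for the actual per-frame SIRF pipeline, i.e. listmode-to-sinograms, scatter estimation, then reconstruction):

```python
# Sketch of parallel per-frame reconstruction with a process pool.
# reconstruct_frame is a hypothetical stand-in for the real SIRF steps.
from multiprocessing import Pool

def reconstruct_frame(frame):
    # Placeholder for: listmode -> sinograms -> scatter estimate -> OSEM
    start, end = frame
    return f"reconstructed frame {start}-{end}s"

if __name__ == "__main__":
    # Frame boundaries in seconds; one worker per frame here.
    frames = [(0, 60), (60, 120), (120, 180)]
    with Pool(processes=3) as pool:
        results = pool.map(reconstruct_frame, frames)
    print(results)
```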

However, it looks like the software is accessing some shared state in the background, because only one of the frames works and I get the following error during the set-up of the ScatterEstimator:

 sirf.Utilities.error: ??? "'\\nERROR: BinNormalisation set-up with different ExamInfo.\\n Set_up was with ...

The only difference between the processes is the "Time frame start - end (duration), all in secs:" field of the output, so I assume this information is saved in some shared place which all the processes are accessing.

Could anybody give me a hint on how to set this up, or would this be considered a bug?

Thank you for your help.

@fdellekart (Author)

Note: The error happens inside the setup of the scatter estimator.

@KrisThielemans (Member)

Sorry for the delay.

I'm not familiar with Python multi-processing unfortunately. I don't know what it does with variables that go to a C/C++ interface. You'd hope that everything runs in subprocesses and is therefore independent (I don't think subprocesses can communicate), but I don't really know.
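To illustrate what I'd hope happens (a minimal sketch, not SIRF-specific): with the "spawn" start method each worker gets a fresh interpreter, so module-level state mutated in a worker never propagates back to the parent or to sibling workers:

```python
# Demonstrates process isolation: mutations to module-level state in a
# worker are invisible to the parent process.
import multiprocessing as mp
import os

GLOBAL_STATE = {"value": "parent"}

def mutate_and_report(_):
    # Each worker mutates its own copy of this module-level dict.
    GLOBAL_STATE["value"] = f"child-{os.getpid()}"
    return GLOBAL_STATE["value"]

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # fresh interpreter per worker
    with ctx.Pool(2) as pool:
        results = pool.map(mutate_and_report, range(2))
    # Workers saw their own mutations...
    assert all(r.startswith("child-") for r in results)
    # ...but the parent's state is untouched.
    assert GLOBAL_STATE["value"] == "parent"
    print("parent state unchanged:", GLOBAL_STATE["value"])
```

If the frames still interfere with each other under "spawn", the shared state would have to live outside the Python processes (e.g. in files on disk), rather than in interpreter or C/C++ memory.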

As far as I know, neither SIRF nor STIR uses global variables (ok, STIR uses them for its "registries" for projectors/file formats etc, but that's irrelevant here). Possibly @evgueni-ovtchinnikov can comment.

On the other hand, when STIR is compiled with OPENMP=ON (default when using the SuperBuild), a lot of operations will be multi-threaded already. I'd hope that this is now the majority, but possibly not. There's a danger that creating subprocesses AND multi-threading becomes less efficient. It might be good to do some monitoring (without the multiprocessing) to see what the load on your system is during various bits of the processing. That would be useful information for us.
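If oversubscription (subprocesses times OpenMP threads exceeding the core count) turns out to be the issue, one standard trick is to cap the threads available to each worker via a pool initializer. This is a sketch, and it assumes the OpenMP runtime used by STIR honours the standard OMP_NUM_THREADS environment variable, as OpenMP programs normally do:

```python
# Cap OpenMP threads per worker so n_workers * threads_per_worker
# roughly matches the number of CPUs.
import multiprocessing as mp
import os

def limit_omp_threads(n):
    # Pool initializer: runs once in each worker before any task.
    # Set OMP_NUM_THREADS before any OpenMP-enabled library creates
    # its thread team (with "spawn", workers start from scratch).
    os.environ["OMP_NUM_THREADS"] = str(n)

def report_threads(_):
    return os.environ.get("OMP_NUM_THREADS")

if __name__ == "__main__":
    n_workers = 4
    threads_per_worker = max(1, (os.cpu_count() or 1) // n_workers)
    ctx = mp.get_context("spawn")
    with ctx.Pool(n_workers, initializer=limit_omp_threads,
                  initargs=(threads_per_worker,)) as pool:
        print(pool.map(report_threads, range(n_workers)))
```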

I think using multiprocessing becomes especially problematic when using a GPU (i.e. Parallelproj). I can't see how this can be done safely with good performance, but again, I don't know.


fdellekart commented Aug 20, 2024

Thank you for the reply. I understand.
After doing a bit of monitoring: AFAICS the load is usually distributed over all CPUs quite well; however, there are still some time frames where the machine is not under full load. Therefore, I thought it could be a good idea to run the reconstruction of individual frames in parallel.
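For the monitoring itself I used something along these lines (a simple stdlib-only load sampler; note os.getloadavg is Unix-only):

```python
# Sample system load relative to CPU count while a reconstruction runs.
import os
import time

def sample_load(duration_s=10.0, interval_s=1.0):
    """Sample the 1-minute load average and return utilisation
    samples normalised by the CPU count (Unix only)."""
    n_cpus = os.cpu_count() or 1
    samples = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        load1, _, _ = os.getloadavg()
        samples.append(load1 / n_cpus)
        time.sleep(interval_s)
    return samples

if __name__ == "__main__":
    utilisation = sample_load(duration_s=3.0)
    print(f"mean utilisation: {sum(utilisation) / len(utilisation):.2f}")
```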

In general: Is there a standard API which I should use for dynamic reconstruction apart from the obvious for loop?

@KrisThielemans (Member)

If you have multiple machines, then you can run individual frames in parallel "by hand" of course.

Is there a standard API which I should use for dynamic reconstruction apart from the obvious for loop?

Not yet, unfortunately.
