Pulseq loading slow and potentially incorrect block ordering #224

gabuzi · 2023-11-20T12:10:06Z

Hello, me again :)

Two aspects on Pulseq that I have noticed:

1. Speed

I have been toying a bit more with Koma, now moving to even larger pulseq files. Pulseq loading is extremely slow for those (does not finish within 20min), and @time read_seq(...) indicates a large amount of allocations.

Let's take as an example a toy cartesian 2D bSSFP sequence with 64 phase encode steps and 2 excitations per step. The result has 768 blocks, the pulseq file is 27KB in size.

Reading this with the latest master ae27d17, gives

julia> time_read_seq(f) = @time read_seq(f)
       println("baseline")
       seq = time_read_seq(seqfile)
baseline
[ Info: Loading sequence bSSFP_FA30deg_TE10ms_TR20ms_2D_(69x64)_pulseq.seq ...
  0.752135 seconds (8.51 M allocations: 299.462 MiB, 11.12% gc time)
Sequence[ τ = 3539.2 ms | blocks: 768 | ADC: 128 | GR: 636 | RF: 128 | DEF: 11 ]

julia>

We can see 8.5M allocations totaling 300MB and a ballpark of 0.5-1s for loading (have warmed-up the function before to force compilation).
A larger sequence file (6912 blocks), 220KB pulseq file comes in at ca 50s time and 21GiB allocations.
Extrapolating this to, say an MRF sequence, where you could easily have two to three orders of magnitude more blocks could easily make pulseq sequence loading completely infeasible (I have tried that). This was with an M1 macbook pro.

I have spent some time digging into the sequence loading code and optimizing it. I went through several iterations and I can now load the small file above in 0.027s with 300k allocations totalling in 11.6MB.

The larger file now loads in 0.25s with 2.59M allocations for 98MiB.

2. [BLOCKS] order

Along the way I realized that Koma seems to interpret the pulseq [BLOCKS] section as being reproduced on a scanner in incrementing ID order instead of line by line in [BLOCKS]. The blocks data is read into dictionaries of (ID => block_data), and then the sequence struct is assembled by iterating over incrementing IDs.
See https://github.com/cncastillo/KomaMRI.jl/blob/ae27d17326b7989d7fc1aceba01a051450e3d219/KomaMRICore/src/io/Pulseq.jl#L412
(which coincidentally is one of the main culprit lines for the slow load),
and then in get_block, we have: https://github.com/cncastillo/KomaMRI.jl/blob/ae27d17326b7989d7fc1aceba01a051450e3d219/KomaMRICore/src/io/Pulseq.jl#L642
obj["blockEvents"][i] is the mapping from block with ID==i to the data from the [BLOCKS] section for this ID.

Digging through the pulseq spec 1.4.1 (my seqs are 1.4.0, but there were no relevant changes from 1.4.1 to 1.4.0): https://pulseq.github.io/specification.pdf, we find sec 2.2:

There are no restrictions on the ordering of IDs, although sequential ordering is often implemented.

And in sec 2.7, p10 top

The sequence must declare at least one block. Any non-zero number of blocks
may be defined. The blocks are executed sequentially.

While this may be seen as a bit ambiguous ("sequentially line-by-line or by-id?"), I think together with sec 2.2, I would say it's line-by-line.

This is good, because it relieves Koma from storing the BLOCKS into a Dict and relatively expensive indirect lookups, but rather the entire [BLOCKS] section can be read into one big Int array that doesn't have to be modified, which is significantly more efficient for creation and access.

This is more or less how I achieved the optimizations described above.

3. What's next?

Since these are more invasive modifications (I had to define the == operator for sequences to ensure that the results I get form different reading functions are the same), and I could only test on my pulseq files. These are all v.1.4.0 and created via pypulseq and thus probably lack the diversity allowed by the specification. So, the obvious question is: are there pulseq test files available in Koma already? I didn't find any.

I'm happy to open a PR with the optimizations so that you can have a closer look. Let me know what you think.

The text was updated successfully, but these errors were encountered:

gabuzi · 2023-11-25T18:18:14Z

At some point I did get a github notification with sequences on the writer branch, but interestingly, the comment doesn't show up here. I'll have a look at those for further testing.

cncastillo · 2023-11-28T15:28:19Z

Hi @gabuzi. Thank you so much for your efforts 😄! It is very nice to see people other than me and Boris contributing to the project.

We have some Pulseq files that test different relevant aspects (I changed the extension .seq to .seq.txt so I can upload them here):

Hope this is useful.

@beorostica we should add some tests to #231 compare our k-space with MATLAB's Pulseq k-space for these examples, to ensure we are reading the same waveforms.

gabuzi · 2023-11-28T17:36:32Z

@cncastillo No worries! To me, that's how open-source software works -- happy to give something back!

Thanks for the examples! At the moment, I am a bit short on time, unfortunately. But I came across the python implementation for pulseq (pypulseq), and I realized that it uses a very similar approach for sequence reading. I didn't think that was coincidental, so I looked at the upstream pulseq matlab implementation, which turned out to be very similar as well :)

Consequently, all implementations suffer the same ambiguity in block ordering, for which I have now opened an issue in the matlab pulseq repo about it: pulseq/pulseq#40

cncastillo · 2023-11-28T18:19:33Z

Yeah it makes sense, the Pulseq reader in Koma is pretty much a copy-paste of Pulseq's MATLAB reader (v1.4.0 at least). It will be interesting to see what they think.

In the meantime, if you are busy, we can test the performance improvements. We would just require a draft pull request.

cncastillo · 2024-04-08T04:17:37Z

I looked at this again using @profview, and I found the same sources of slow down you mentioned. The culprits were type instability in get_blocks and read_blocks, the flip angle calculation, the Sequence constructor being slow, and the use of scanf in some parts. I fixed them in 6176d03.

The current master:

New version from #321

So now the reader is 20 times faster! 🥳 Thanks for detecting the problem and the great progress on the solution in #238 😄

In #321 we are also including some tests to be sure that we are reading the same as MATLAB's Pulseq (v1.4.X), but we still need to add some more for PyPulseq (v1.3.X) #318.

Cheers!

gabuzi · 2024-04-08T09:35:56Z

This is great news! 20x is a great speedup 🚀

I'm just noting this here as it might have gone under in my relatively big opening post for this issue. The numbers I have obtained back then indicate that another factor of 10x should be possible. My benchs using #238 saw 50s -> 0.25s for a seq with 6912 blocks. But then again, other bits that influence loading performance might have changed.

Arguably, the 20x speedup is enough for for most standard rapid imaging sequences, but longer ones like 3D MRF without undersampling (e.g. for debugging) could become a patience tester ;)

cncastillo · 2024-04-08T12:13:59Z

Nice! I am all for speed improvements. I will reopen this so I don't forget to look at this on the weekend. The difference in speed could be related to the use of the dictionaries. I will need to investigate.

Also related to performance, we recently enabled a new gl option for plot_seq that should make the plots much faster #364. In the UI this option is automatically enabled if the sequence has more than 1000 blocks.

gabuzi · 2024-04-08T12:25:50Z

From memory, I am quite certain that it was indeed due to dictionaries. IIRC, in #238 I avoided those to get those further speedups. If I'm not mistaken, this did come with the caveat that sequence blocks were then ordered by order as they appear in the file, not by their ID (which is fine according to the pulseq standard, see #224 (comment), but crucially is not what the pulseq matlab reference implementation does!).

Also related to performance, we recently enabled a new gl option for plot_seq that should make the plots much faster #364. In the UI this option is automatically enabled if the sequence has more than 1000 blocks.

This is great to hear! I remember plots being an issue for those large sequences.

cncastillo added the 1) high priority label Nov 21, 2023

cncastillo assigned beorostica Nov 28, 2023

gabuzi mentioned this issue Dec 3, 2023

Load pulseq sequences faster and allow comparison #238

Draft

beorostica mentioned this issue Jan 23, 2024

Optimize the process of reading .seq files #301

Closed

cncastillo mentioned this issue Mar 12, 2024

Incorrect gradient interpretation when waveforms do not end in zero #321

Merged

cncastillo linked a pull request Apr 8, 2024 that will close this issue

Incorrect gradient interpretation when waveforms do not end in zero #321

Merged

cncastillo added a commit that referenced this issue Apr 8, 2024

KomaFiles Pulseq read optimization and type stability #224

6176d03

cncastillo closed this as completed in #321 Apr 8, 2024

cncastillo reopened this Apr 8, 2024

cncastillo added the performance label May 31, 2024

cncastillo assigned cncastillo and unassigned beorostica Sep 2, 2024

cncastillo added 2) medium priority and removed 1) high priority labels Sep 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pulseq loading slow and potentially incorrect block ordering #224

Pulseq loading slow and potentially incorrect block ordering #224

gabuzi commented Nov 20, 2023

gabuzi commented Nov 25, 2023

cncastillo commented Nov 28, 2023 •

edited

Loading

gabuzi commented Nov 28, 2023

cncastillo commented Nov 28, 2023

cncastillo commented Apr 8, 2024

gabuzi commented Apr 8, 2024

cncastillo commented Apr 8, 2024

gabuzi commented Apr 8, 2024

Pulseq loading slow and potentially incorrect block ordering #224

Pulseq loading slow and potentially incorrect block ordering #224

Comments

gabuzi commented Nov 20, 2023

1. Speed

2. [BLOCKS] order

3. What's next?

gabuzi commented Nov 25, 2023

cncastillo commented Nov 28, 2023 • edited Loading

gabuzi commented Nov 28, 2023

cncastillo commented Nov 28, 2023

cncastillo commented Apr 8, 2024

gabuzi commented Apr 8, 2024

cncastillo commented Apr 8, 2024

gabuzi commented Apr 8, 2024

cncastillo commented Nov 28, 2023 •

edited

Loading