
Merging output coffea files

Alp edited this page Jun 3, 2023 · 3 revisions

Output file merging

To plot results, the output coffea files must first be merged into a single accumulator. The coffea files themselves are produced by submitting jobs with jexec, which is documented separately.

The merging is done via the jmerge executable, as follows:

jmerge $indir -o ./path/to/output/directory -j4

In the command above, $indir is the path to the submission directory that holds the individual .coffea files, -o specifies the directory in which the merged accumulator is saved, and -j specifies the number of parallel jobs used for the merging.
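If the merge needs to be launched from a Python script rather than typed by hand, the same invocation can be wrapped with the standard subprocess module. This is only a sketch: the helper names build_jmerge_command and run_jmerge are hypothetical, and it assumes jmerge is on the PATH and accepts exactly the arguments shown above.

```python
import subprocess


def build_jmerge_command(indir, outdir, jobs=4):
    """Return the argument list for the jmerge call shown above.

    Hypothetical helper: only the positional input directory and the
    -o and -j options from this page are used.
    """
    return ["jmerge", str(indir), "-o", str(outdir), f"-j{jobs}"]


def run_jmerge(indir, outdir, jobs=4):
    """Run jmerge and raise if the process exits with a non-zero status."""
    # check=True makes subprocess raise CalledProcessError on a non-zero exit
    subprocess.run(build_jmerge_command(indir, outdir, jobs), check=True)
```

For example, run_jmerge("./path/to/submission", "./path/to/output/directory", jobs=4) would issue the same command as the one above, with both paths being placeholders.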

Access in code

Once the merge is done via jmerge, the accumulator is accessed in the code using the klepto library as follows:

from klepto.archives import dir_archive

acc = dir_archive("/path/to/merged/files") # Same as the -o argument to jmerge

Each histogram inside the accumulator can then be accessed by first loading a copy into memory via acc.load(), as follows:

# Let's say we want to access the MET histogram which we named as "met"
distribution = "met"

acc.load(distribution)
histo = acc[distribution]

Note that without the first acc.load() call, a direct attempt to access acc[distribution] will give a KeyError.
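Since forgetting the acc.load() call is an easy mistake, the two steps can be bundled into one small helper. The sketch below is not part of the framework: load_histogram is a hypothetical name, and it assumes only that acc behaves like the dir_archive above, that is, it has a load() method and supports acc[name].

```python
def load_histogram(acc, name):
    """Load one histogram from a merged accumulator and return it.

    `acc` is assumed to behave like the klepto dir_archive shown above:
    it provides load(name) and item access via acc[name].
    """
    # Copy the entry from disk into memory before accessing it
    acc.load(name)
    try:
        return acc[name]
    except KeyError:
        # Re-raise with a message that names the missing histogram
        raise KeyError(
            f"no histogram named {name!r} found in the merged accumulator"
        ) from None
```

With this helper, the example above reduces to histo = load_histogram(acc, "met").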
