
Checking File Generation Scripts Work

Steven Timm edited this page Jul 8, 2022 · 3 revisions

Three of the four np04-srv-xxx machines are set up via crontab to generate sets of data files at set intervals using shell scripts. These scripts all run as the "np04daq" user and live in ~np04daq/dc4/bin. The initial data samples live in /data0/dc4/sample on the respective machines. In general, three scripts run on each machine.

createDataFile.sh

This script prepends a prefix and a timestamp to each original file name in /data0/dc4/sample and copies it to /data0/dc4. This way all file names are unique. Each run of this script effectively makes a new faux run number.

This script will not run unless a lock file has been touched in /tmp. /tmp is cleaned up on these machines about once a week on average, so the lock file has to be touched again after each cleanup for file generation to continue.
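
The behavior described above can be sketched roughly as follows. This is not the real createDataFile.sh, only an illustration of the technique: the function name, the lock file path, the prefix, and the exact "prefix_timestamp_originalname" layout are all assumptions, since the page does not spell them out.

```shell
# Hypothetical sketch of the createDataFile.sh logic; NOT the real script.
# Arguments:
#   $1 sample directory  (e.g. /data0/dc4/sample)
#   $2 output directory  (e.g. /data0/dc4)
#   $3 lock file         (assumed name, e.g. /tmp/dc4_generate.lock)
#   $4 prefix            (assumed, e.g. dc4)
make_faux_run() {
    local sample_dir="$1" out_dir="$2" lock_file="$3" prefix="$4"
    local stamp src

    # Refuse to generate anything unless the lock file is present.
    if [ ! -e "$lock_file" ]; then
        echo "lock file $lock_file not found; not generating files" >&2
        return 1
    fi

    # One timestamp per invocation, so every file from this run shares it.
    stamp="$(date +%Y%m%d%H%M%S)"

    for src in "$sample_dir"/*; do
        [ -f "$src" ] || continue
        # A real copy, not a symlink, so each output file is independent.
        cp "$src" "$out_dir/${prefix}_${stamp}_$(basename "$src")"
    done
}
```

Under these assumptions, re-enabling generation after a /tmp cleanup would amount to touching the lock file again as the np04daq user.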

createMetadataFile_dc4.sh

This is a modified version of Kurt Biery's script. It makes a rudimentary metadata file for each of the data files created above. A variant of it is needed for each data type because the metadata fields differ slightly.
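
As a rough illustration of the idea only, the sketch below writes a stub JSON file next to each data file that does not have one yet. It is not Kurt Biery's script or any of its variants. The function name, the "<datafile>.json" naming, and the two fields shown (file_name, file_size) are assumptions made for the example; the real metadata has more fields and differs per data type.

```shell
# Hypothetical sketch; NOT the real createMetadataFile_dc4.sh.
# Writes <datafile>.json for every data file in $1 that lacks one.
# Assumes simple file names (no quotes or backslashes) so that the
# name can be embedded in JSON without escaping.
write_stub_metadata() {
    local dir="$1"
    local f size

    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Skip metadata files and anything already handled by the daemon.
        case "$f" in
            *.json|*.copied) continue ;;
        esac
        # Do not overwrite metadata that already exists.
        [ -e "$f.json" ] && continue

        size=$(wc -c < "$f")
        # Field names here are illustrative assumptions.
        printf '{"file_name": "%s", "file_size": %d}\n' \
            "$(basename "$f")" "$size" > "$f.json"
    done
}
```

Try something like this only in a scratch directory: in /data0/dc4 any new json file is a signal to the ingest daemon.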

Once the json file with the metadata is present in /data0/dc4, the ingest daemon will see it, arrange to copy the data file and the json file to public EOS, and rename both files to *.copied.

removal script

This runs once an hour to remove all the *.copied files.
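
The page does not name or show the removal script, so the following is only a minimal sketch of what such a cleanup could look like. The function name is hypothetical. The search is limited to the top level of the directory on the assumption that the sample subdirectory should never be touched.

```shell
# Hypothetical sketch of an hourly cleanup; NOT the real removal script.
# Deletes regular files named *.copied directly inside $1, without
# descending into subdirectories such as sample/.
remove_copied() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -name '*.copied' -delete
}

# An hourly crontab entry has the general shape below. The path is a
# placeholder, because the real script name is not given on this page:
#   0 * * * * /path/to/removal-script
```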

np04-srv-002

Data files are of the form np02_bde_coldbox_run012352*.hdf5. This is run 12352 from the np02 bottom drift electronics cold box. There are 60 files of 4 GB each.

Scripts run from crontab are createDataFile.sh and createMetadataFile_dc4.sh.

np04-srv-003

Data files are of the form 455_*_cb.test. These are from run 455 of the np02 coldbox top drift electronics. They are raw binary files out of the legacy np02 DAQ system. There are 81 files of 3 GB each.

Scripts run from crontab are createDataFile_top.sh and createMetadataFile_dc4_top.sh.

np04-srv-001 (will move to np04-srv-004 once it is back)

Data files are of the form bc38ee1a-3092-441c-9b37-4c106ae5cf48-gen_protodunehd_1GeV_56895279_0_g4_detsim.root and are detsim files of the ProtoDUNE II HD detector. There are 48 unique files in the sample, each about 1.3 GB and each cloned 4x, for a total sample of ~240 GB.

Scripts run from crontab are createDataFile_hd.sh and createMetadataFile_dc4_hd.sh.

If all these scripts are running successfully, you should see data files being copied into /data0/dc4, then json files appearing next to them, and then both being renamed to *.copied.
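
One way to watch this progression is to count the files in each state. The helper below is a hypothetical sketch, not an existing tool on these machines. It assumes that anything in the directory that is neither *.json nor *.copied is a data file still waiting to be picked up.

```shell
# Hypothetical status helper; not part of the dc4 scripts.
# Prints how many top-level files in $1 are in each stage:
#   data   = data files not yet renamed (everything that is not json/copied)
#   json   = metadata files waiting for the ingest daemon
#   copied = files already renamed by the ingest daemon
dc4_status() {
    local dir="$1"
    local total json copied

    total=$(find "$dir" -maxdepth 1 -type f | wc -l)
    json=$(find "$dir" -maxdepth 1 -type f -name '*.json' | wc -l)
    copied=$(find "$dir" -maxdepth 1 -type f -name '*.copied' | wc -l)

    echo "data=$(( total - json - copied )) json=$(( json )) copied=$(( copied ))"
}
```

Calling `dc4_status /data0/dc4` a few times some minutes apart should show the counts moving from data to json to copied if everything is healthy. If nothing new appears at all, `crontab -l` run as np04daq and the /tmp lock file are reasonable first things to check.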

At full rate these scripts will run 6x an hour, generating about 750 GB in aggregate every time they run, or roughly 100 TB over a 24-hour day (144 runs x 750 GB is about 108 TB). This will put a significant and noticeable I/O load on the machines, since we are actually copying the files, not symlinking them. We may end up making dc4 directories on other data areas besides /data0 to spread out the load.
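
The volume figures can be checked with a few lines of shell arithmetic. The inputs are the numbers quoted on this page; the average-rate line is an extra derived figure that assumes decimal units (1 GB = 1000 MB) and a perfectly even spread over the day, which real traffic will not have.

```shell
# Back-of-the-envelope check of the data volume, using the page's numbers.
runs_per_hour=6
gb_per_run=750                                  # aggregate over all machines

runs_per_day=$(( runs_per_hour * 24 ))          # 144
gb_per_day=$(( runs_per_day * gb_per_run ))     # 108000 GB, about 108 TB
mb_per_sec=$(( gb_per_day * 1000 / 86400 ))     # 1250 MB/s averaged over a day

# Sum of the three samples as listed: 60 x 4 GB, 81 x 3 GB, and ~240 GB.
sample_gb=$(( 60 * 4 + 81 * 3 + 240 ))          # 723 GB, rounded up to 750 above

echo "runs per day:        $runs_per_day"
echo "GB per day:          $gb_per_day"
echo "average MB/s:        $mb_per_sec"
echo "sample total in GB:  $sample_gb"
```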

We are working on getting all the various data challenge operators access to the np04daq account. We don't have it yet.