# Tutorial for ProcessND.py
This is a tutorial for ProcessND.py, but it is outdated. The official way of making samples is on NERSC. Please talk to #nd_production or #nd_reco_sim on Slack for more information about that.
Generating your own samples should always be a last resort. Chances are that there's already a sample that can meet your needs, or that others have similar needs and we should make a combined sample that meets all of them.
Running a large sample is expensive. All samples should be processed with permission from #nd_production and/or #nd_reco_sim. They should be kept in the loop in all cases. If they don't know what you're up to, you probably shouldn't be running on the grid.
Running a large sample and then needing to scrap it is a waste of resources. Make sure all the files are as you expect them. Unfortunately, the errors are sometimes not evident. For example, there will always be an edep-sim file even if the GENIE stage crashed; this can break the TMS reco stage running on that edep-sim file, because the file is empty of events. The log files will also sometimes show multiple errors, because nothing checks whether the previous stage completed successfully.
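Because a crashed stage can still leave an (empty) output file behind, it's worth a quick look at file sizes before running later stages. A minimal sketch; the 10 kB threshold is an arbitrary guess, not an official cut:

```shell
# Flag suspiciously small .root files in the current directory.
# The 10 kB threshold is an arbitrary assumption; tune it for your sample.
for f in ./*.root; do
  [ -e "$f" ] || continue
  size=$(wc -c < "$f")
  if [ "$size" -lt 10240 ]; then
    echo "WARNING: $f is only $size bytes - check its log"
  fi
done
```

Anything flagged this way should be cross-checked against its log file before the sample is used.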
The AL9 transition is coming soon and we are already seeing problems with these scripts. Use them at your own risk. For now, running in a Singularity container seems to work:

```shell
--singularity-image /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest
```
Make sure the files are somewhere where everyone can use them. Also make sure that what is in the files is well documented, like the individual settings used.
That said, there are reasons to run your own samples:

- In some cases, waiting for a NERSC sample might take too long. Especially if you're just rerunning the dune-tms code, the turnaround is much faster running on the grid.
- Some functionality (like pileup) is not yet perfectly replicated in the NERSC samples, at least for the dune-tms code.
```shell
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
cd <dir with your local copy of ND_Production>
cd scripts # For ProcessND.py
```
`ProcessND.py` takes in a bunch of parameters and then creates `processnd.sh`. This is the actual script that is submitted to the grid with `jobsub_submit`.
- `-N`: The number of files. For generation, this is the final number of files: if each file is 1e15 POT and you run 1000 files, the total is 1e18 POT. If using `--indir` with existing files, `-N` should be set to the number of those files. Each file is generated on a separate grid node.
- `--memory=4000MB`: A good starting point. When using overlay, this needs to be increased.
- `--expected-lifetime=24h`: 24h is probably too long, but running all stages with 1e16 POT takes a while; a tmsreco-only sample usually takes under an hour. The closer this is set to the actual run time, the better the chance a job has of finding a node, i.e. the quicker things get processed.
- `--tar_file_name`: This should point to the dune-tms tar file. Eventually dune-tms will be its own product, but for now all the code needed to run it lives in this tar file.
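As a quick sanity check of the `-N` arithmetic above, the total POT is just the number of files times the POT per file (the numbers here match the example: 1000 files at 1e15 POT each):

```shell
# Sanity-check the -N / POT arithmetic before submitting:
# 1000 files x 1e15 POT per file = 1e18 POT total.
python3 -c "print(f'{1000 * 1e15:.0e}')"   # prints 1e+18
```

Doing this once before submitting is much cheaper than discovering a factor-of-ten mistake after a day of grid time.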
```shell
tar -czvf dune-tms.tar.gz dune-tms
mv dune-tms.tar.gz /pnfs/dune/persistent/users/kleykamp/dune-tms_tarfiles/2024-04-18_add_truth_info_test.tar.gz
```
Please make sure to at least tag the version of dune-tms you used, so that we have some reproducibility. Here's an example corresponding to the tar file above:
```shell
git tag kleykamp_2024-04-18_add_truth_info_test
git push origin kleykamp_2024-04-18_add_truth_info_test
```
Ideally we all use the same release. Talk to #nd_muon_spectrometer or #nd_muon_spectrometer_code if unsure. It's unlikely this tutorial will have the latest version.
```shell
python ProcessND.py --stages tmsreco --indir /pnfs/dune/persistent/users/abooth/Production/MiniProdN1p2-v1r1/run-spill-build/output/MiniProdN1p2_NDLAr_1E19_RHC.spill/EDEPSIM_SPILLS/00000/ --outdir /pnfs/dune/scratch/users/kleykamp/nd_production/2024-04-11_add_truth_info_test
jobsub_submit --group dune --role=Analysis -N 100 --singularity-image /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-wn-sl7:latest --expected-lifetime=24h --append_condor_requirements='(TARGET.HAS_CVMFS_dune_osgstorage_org==true)' --memory=4000MB --tar_file_name dropbox:///pnfs/dune/persistent/users/kleykamp/dune-tms_tarfiles/2024-04-11_add_truth_info_test.tar.gz file://processnd.sh
```
```shell
python ProcessND.py --outdir /pnfs/dune/scratch/users/kleykamp/nd_production/2024-04-11_add_truth_info_test_with_overlay --geometry_location /pnfs/dune/persistent/physicsgroups/dunendsim/geometries/TDR_Production_geometry_v_1.0.3/nd_hall_with_lar_tms_sand_TDR_Production_geometry_v_1.0.3.gdml --manual_geometry_override nd_hall_with_lar_tms_sand_TDR_Production_geometry_v_1.0.3.gdml --topvol volDetEnclosure --pot 1e15 --stages gen+g4+tmsreco
```
Then run `jobsub_submit` as before. This adds the `gen+g4` stages, which are GENIE (neutrino interaction simulation) and edep-sim (particle-through-detector simulation). In this case, we need to point to the correct version of the geometry, which is geometry_v_1.0.3 as of 2024-04-17 (always check that this is still true): first the location on pnfs where it's copied from, then the file name. `--topvol volDetEnclosure` restricts simulation to events in the LAr, TMS, and SAND volumes (it also simulates some of the crane, elevator, and egress hallway). If you used `volWorld` instead, it would simulate in the whole rock around the detector, which is very inefficient (our modern code uses a more efficient rockbox technique).
- Run only GENIE and edep-sim: use `--stages gen+g4`. When doing this, we no longer need the `--tar_file_name` parameter, which holds only the dune-tms code.
- Run LAr-only: use `--topvol volLArBath`. You can look at the geometry files through our geometry viewer here. You can find out more about geometry on the #nd_geometry Slack channel.
- Run with pileup: add `--timing spill --overlay`, and increase the memory request to 8000MB.
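As an illustration, the variations above can be combined. The sketch below is a hypothetical gen+g4, LAr-only submission; the output path is a placeholder, and the flags should be double-checked against ProcessND.py itself:

```shell
# Hypothetical gen+g4, LAr-only run. No --tar_file_name is needed,
# since the dune-tms code is not used in these stages.
python ProcessND.py \
  --stages gen+g4 \
  --topvol volLArBath \
  --geometry_location /pnfs/dune/persistent/physicsgroups/dunendsim/geometries/TDR_Production_geometry_v_1.0.3/nd_hall_with_lar_tms_sand_TDR_Production_geometry_v_1.0.3.gdml \
  --manual_geometry_override nd_hall_with_lar_tms_sand_TDR_Production_geometry_v_1.0.3.gdml \
  --pot 1e15 \
  --outdir /pnfs/dune/scratch/users/<you>/nd_production/<sample name>
```

As before, this only creates `processnd.sh`; you still submit it with `jobsub_submit`.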
We can process with off-axis fluxes using the `-oa` (off-axis) option, which takes the number of meters off axis. This also requires `--dk2nu`, because only dk2nu flux files contain the full neutrino information. The gsimple files used by default "flatten down" the dk2nu files so that they give you only the flux assuming a detector 0 m off axis, so those files are not suitable.
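For example, an off-axis run might look like the sketch below. The 30 m offset is an arbitrary illustration, the output path is a placeholder, and the exact form of the `-oa` and `--dk2nu` options should be checked in ProcessND.py itself:

```shell
# Hypothetical off-axis run: -oa takes the off-axis distance in meters,
# and --dk2nu is required so the full neutrino information is available.
python ProcessND.py \
  --stages gen+g4 \
  -oa 30 \
  --dk2nu \
  --outdir /pnfs/dune/scratch/users/<you>/nd_production/<sample name>
```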
It is possible to run with alternate B-fields, but I'm not sure how. I don't think the existing ProcessND.py handles it correctly; at least I haven't seen it validated.