-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add true position and E to hits dataset #127
Conversation
YifanC
commented
May 24, 2024
- Added true drift position and true energy reconstructed hits.
- Adapted to use configuration instead hardcoded numbers for hit reconstruction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these changes look good to me, except for the namesake of this PR (adding truth information to the hits dataset), but it would be good to see some validation plots before merging this in.
@@ -161,6 +161,33 @@ def run(self, source_name, source_slice, cache): | |||
|
|||
has_mc_truth = packet_seg_bt is not None | |||
|
|||
if has_mc_truth: | |||
self.calib_hits_dtype = np.dtype(self.calib_hits_dtype.descr + [('x_true_seg_t', 'f8'), ('E_true_recomb_elife', 'f8')]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rational for this? I don't think we should be mixing truth and reco information in the same dataset. This is what the back tracking datasets are for, no? We should also try and keep the "reco" datasets consistent for mc and data..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hits with true t0 position is required for MLreco label making. Storing all contributions with the new commit (also corrected a unit issue). Given our setup, the output recombination (close to 0.7), lifetime (close to 1), E values (~0.5) all make sense to me. The differences between x
and x_true_seg_t
, E
and E_true_recomb_elife
also fits the scale and is consistent with the making.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truth information should go into the backtracking and truth datasets, not the reco datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make more concrete suggestions how to structure this? The current proposal makes sense to me in a way that these are extension of what calib_prompt_hits have. Especially E_true_recomb_elife
is half true half readout. Putting in mc_truth
requires backtracking, not straightforward matching. It doesn't affect the processing of the dataset and the name of the variable is clear about that it uses truth information.