-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove simID
and recID
fields from the *Association types
#51
base: main
Are you sure you want to change the base?
Conversation
The reason these were introduced was also in part because in python it was not possible to load the relations branches with |
I haven't ran into any issues. If you have pointers to any reproducers, I can try. |
I think the main issue was the encoding of the associations, where the index had the format 10000*collectionID+index, which was the not possible to use as a proper array index. That seems fine now: In [7]: print(file['events']['EcalEndcapNHits#0.index'].array())
[[0, 1], [0, 1, 2, 3, 4, 5, 6, ..., 118, 119, 120, 121, 122, 123], ..., [0, 1]] |
And also applied to collections: In [17]: print(file['events']['EcalEndcapPInsertTruthClusterAssociations#0.index'].array())
[[0, 1, 2], [0, 1], [0, 1, 2, 3, ..., 27, 28, 29, 30], ..., [0, 1, 2, 3], [0]]
In [18]: print(file['events']['EcalEndcapPInsertTruthClusterAssociations#1.index'].array())
[[32, 31, 32], [139, 435], [2185, 893, ..., 72], ..., [25, 25, 25, 318], [77]] |
Yes, collection ids and indices are separate branches (in the new frames podio format?) |
I agree this looks good. We can probably proceed as follows:
|
We don't have to rush EICrecon changes ahead of this, I don't mind updating after a delayed review. |
@bspage912 brought up that it's TTreeArray that might not be able to handle pound symbols. The suggestion from @sly2j was to postpone this until a user-friendly approach is found for working with relations (it is appreciated that other relations in EDM4eic do not receive any new "*ID" fields of their own). The users are to be retrained at that later stage. @mdiefent reminds us about principle of user-centric approach. |
I performed a quick check and TTreeReader (used within compiled c++) can handle branch names with pound symbols, so at least that shouldn't be an issue. I also confirm I see the same values in the *#0 and *#1.index fields as are in recID and simID. |
simID was originally introduced in eicd as there was no way to refer to a EDM4hep collection at the time https://github.com/eic/EDM4eic/blame/7299aba54097a3f2a17c0242731c9dabe68fb450/eic_data.yaml#L546-L547 Later, recID was introduced for convenience, and possibly had to be used to hack associations into an early versions of EICrecon. We don't need neither. This information is already available via sim and rec relations.
2b39cb1
to
d3b75aa
Compare
Conflicts resolved, bumped next version to 7.0.0 since API breaking change. |
simID
was originally introduced in eicd as there was no way to refer to aEDM4hep collection at the time
8305134#diff-1d37563b8df596e24b79c561e07c4e0c63e1b501670cf25f22e1ecc21eec273fR532
Later,
recID
was introduced in 2940813, and possibly had to be used tohack associations into an early versions of EICrecon. We don't need neither now: this information is already available via
sim
andrec
relations.