Skip to content
This repository has been archived by the owner on Aug 8, 2024. It is now read-only.

Query using meta-data #13

Open
adelavega opened this issue Dec 5, 2022 · 10 comments
Open

Query using meta-data #13

adelavega opened this issue Dec 5, 2022 · 10 comments
Assignees
Labels
ancp effort: medium Estimated medium effort task good first issue Good for newcomers

Comments

@adelavega
Copy link
Collaborator

I believe ancpbids does not allow querying by meta-data keys, only entities.

@erdalkaraca if this is correct, we also need to address this, as many downstream pipeline use this functionality

@erdalkaraca
Copy link
Collaborator

I did not encounter a use case in the pybids test suite to show that requirement. Or do you mean BIDSLayout.get_metadata()? That is already available.

Can you provide an example on how to query by metadata keys?

@adelavega
Copy link
Collaborator Author

If BIDSLayout(index_metadata=True), all meta-data key-value pairs are available for querying files.

e.g.:

layout = BIDSLayout('bids/tests/data/7t_trt/')
layout.get(RepetitionTime=3.0)

Returns 40 files.

You are correct this is not in the tests, which I'm surprised about and is a definite gap in our tests.
Pybids is much faster when index_metadata=True btw, although I believe its still much slower than ancpbids.

@erdalkaraca
Copy link
Collaborator

I see, maybe that is also the reason why get_entities() returns metadata keys or you can query by metadata keys because pybids mixes entity keys and metadata keys internally.

Will have to do some research to make this work.

@erdalkaraca erdalkaraca self-assigned this Dec 7, 2022
@adelavega
Copy link
Collaborator Author

That is correct. I'm open to thinking about how this could be better implemented.

However, I think that the use case of filtering by meta-data is widely used, and there is a fair amount of demand for it.

@erdalkaraca
Copy link
Collaborator

erdalkaraca commented Dec 7, 2022

what about this?

layout.get(sub="01", metadata={RepetitionTime:[1.0, 2.0], EchoTime: 3.5})

-> keep filters for entities and metadata separate
-> should already be possible using the internal query language but not yet implemented in BIDSLayout

implementation wise this might induce some performance penalties because of the inheritance principle of metadata files, i.e. parsing files from leaves up to dataset level... but once the metadata for an image file is cached, should not be an issue anymore

@adelavega
Copy link
Collaborator Author

That looks good to me in principle, but it will be backwards breaking. We could add a tempoary "shim" that converts any non entity query parameters to the metadata field, to sustain temporary backwards compatibility.

WDYT @effigies

@adelavega
Copy link
Collaborator Author

I would also be okay with breaking backwards comparability, it makes sense to me to not mix entities and meta-data

@erdalkaraca erdalkaraca added effort: medium Estimated medium effort task good first issue Good for newcomers labels Dec 7, 2022
@effigies
Copy link
Collaborator

effigies commented Dec 7, 2022

I think our shim can reasonably do the translation, and we can emit a warning to say that it will need to be explicit in the future.

@erdalkaraca
Copy link
Collaborator

erdalkaraca commented Dec 7, 2022

So, the above example would look like:

layout.get(sub="01", RepetitionTime=[1.0, 2.0], EchoTime=3.5)

I am fine with that as well. This might also be more convenient for users as no embedded dicts involved as parameters.

@adelavega
Copy link
Collaborator Author

Yes, this is consistent w/ the current API (although I'm not sure if lists are allowed in a query-- but that seems like a good thing)

@adelavega adelavega added the ancp label Dec 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
ancp effort: medium Estimated medium effort task good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants