Allow projections in find headers #36

Open
wants to merge 2 commits into
base: master

vshekar
Collaborator

@vshekar vshekar commented Oct 10, 2024

This PR is aimed specifically at the way data is stored for the MX beamlines; it can be extended to other collections if needed.

The MX beamlines use analysis store headers to store the results of rasters. An example of a raster result is as follows:

{
    "time": 1722530001.5539453,
    "uid": "595e358c-b1a6-4d99-ac1b-70a92937cb77",
    "provenance": {
        "lsdc": 1
    },
    "result_type": "rasterResult",
    "owner": "lsdc-amx",
    "sample": "b8086557-e2cc-4466-994c-137ce90fafa4",
    "request": "924916a6-7530-424a-861a-9078c7e94985",
    "result_obj": {
        "sample_id": "b8086557-e2cc-4466-9e4c-127ce90fafa4",
        "parentReqID": "a39bfafd-2398-49db-9e40-977f07be4c32",
        "rasterCellMap": {
            "cellMap_1": {
                "x": 5662.5,
                "y": 700.9384045160338,
                "z": 233.98982510854614
            }, ... <hundreds more cellMap co-ordinates>
        },
        "rasterCellResults": {
            "type": "dialsRasterResult",
            "resultObj": [
                {
                    "image": [
                        "/path/to/master.h5",
                        1
                    ],
                    "spot_count": 0.0,
                    "spot_count_no_ice": 0.0,
                    "d_min": 50.0,
                    "d_min_method_1": 50.0,
                    "d_min_method_2": 50.0,
                    "total_intensity": 0.0,
                    "cellMapKey": "cellMap_1"
                }, ... <hundreds more cell results>
            ]
        },
        "proposalID": "312346",
        "beamline": "amx"
    }
}

For context, dumping a raster result like the one above, with 195 cells, to a file with indent=4 produces around 3500 lines. Not all of that data is needed to reconstruct the raster. Specifically, in certain situations there is no need for "rasterCellMap" or for the data within "rasterCellResults.resultObj". With projections, one can drop keys like "rasterCellMap" and potentially reduce the time it takes to fetch raster data, especially when a single query matches a large number of rasters.
Removing keys from within "rasterCellResults" is not straightforward, but it could be made so by changing the structure of the data. That is a PR for another repo.
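
To illustrate the effect of such a projection, here is a minimal pure-Python sketch that mimics an exclusion projection on a dict shaped like the raster result above. It is not the code in this PR: `apply_exclusion_projection` and `make_fake_raster_header` are hypothetical helpers, and the header contents are made-up placeholder values. With a real MongoDB-backed store the database applies the projection itself, for example by passing `{"result_obj.rasterCellMap": 0}` as the `projection` argument of pymongo's `Collection.find`.

```python
import copy
import json


def apply_exclusion_projection(doc, projection):
    """Return a copy of doc without the dotted-path keys marked 0 in projection.

    Hypothetical helper: it only handles exclusion of keys nested in dicts,
    which is enough to illustrate the idea. A real database applies the
    projection server-side.
    """
    result = copy.deepcopy(doc)
    for dotted_path, flag in projection.items():
        if flag:
            raise ValueError("this sketch only supports exclusion (value 0)")
        *parents, leaf = dotted_path.split(".")
        node = result
        # Walk down to the dict that holds the key to drop.
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            node.pop(leaf, None)
    return result


def make_fake_raster_header(n_cells):
    """Build a synthetic header shaped like the example above (made-up values)."""
    cell_map = {
        f"cellMap_{i}": {"x": 0.0, "y": float(i), "z": 0.0}
        for i in range(1, n_cells + 1)
    }
    cell_results = [
        {"spot_count": 0.0, "d_min": 50.0, "cellMapKey": f"cellMap_{i}"}
        for i in range(1, n_cells + 1)
    ]
    return {
        "uid": "hypothetical-uid",
        "result_type": "rasterResult",
        "result_obj": {
            "rasterCellMap": cell_map,
            "rasterCellResults": {
                "type": "dialsRasterResult",
                "resultObj": cell_results,
            },
        },
    }


header = make_fake_raster_header(195)
projected = apply_exclusion_projection(header, {"result_obj.rasterCellMap": 0})

# Compare the size of the pretty-printed documents.
full_lines = len(json.dumps(header, indent=4).splitlines())
projected_lines = len(json.dumps(projected, indent=4).splitlines())
print(f"full: {full_lines} lines, projected: {projected_lines} lines")
```

The sketch leaves "rasterCellResults" untouched, in line with the note above that trimming keys inside it is a separate problem.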
