🔥 Issue with downloading data using query #6

Open
Brookluo opened this issue May 23, 2023 · 2 comments

Comments

Brookluo commented May 23, 2023

Hello,

I was trying to download SAGE imaging data using the query function. To get as much data as possible, I set an early start date and left the end date unspecified so that the query would return everything available up to today. However, I received only a fraction of the available data: the returned data frame had just 1220 rows, and the time span it covers is very short, as you can see in the attached image.
[Attached image: Screenshot 2023-05-23 at 1 55 41 PM]

My query is:

import sage_data_client

df = sage_data_client.query(
    start="2023-01-01T00:00:00",
    # end="2023-05-22T23:51:36.246454082Z",
    filter={
        "plugin": "*mobotix-scan*",
        # "vsn" : "W071"
    }
)

The images came from only four nodes ('V023', 'V032', 'W056', 'W057'). These are not all of the nodes available for this plugin, as the query browser on the SAGE website shows: https://portal.sagecontinuum.org/query-browser?apps=registry.sagecontinuum.org%2Fbhupendraraut%2Fmobotix-scan%3A0.23.4.24

After some further research, we found that the imaging data returned comes only from nodes physically located at Argonne. Could there be an issue with the code or with the internet connection?

I downloaded the sage-data-client package from PyPI, and the version is 0.5.0.post1. Please let me know if you need additional information.

Thanks,
Yufeng


Brookluo commented May 23, 2023

Bhupendra tested this code snippet, and it downloaded the expected data successfully.

df = sage_data_client.query(
    start="20230401-20:25:00",
    end="20230405-20:25:10",
    filter={
        "plugin": "*mobotix-scan.*",
        "name": "upload",
        # "vsn": "V008"
    }
)

It seems the only difference is the format of the timestamp string.
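
One way to check whether the timestamp format really is the only difference is to compare the arguments of the two calls side by side. The following is a minimal sketch that only diffs plain dictionaries copied from the two snippets above. It makes no call to sage_data_client, and the helper names flatten and diff_args are made up for illustration.

```python
# Arguments of the two queries quoted above, copied as plain dictionaries.
# Nothing here talks to the SAGE data API; this only compares the inputs.
first = {
    "start": "2023-01-01T00:00:00",
    "filter": {"plugin": "*mobotix-scan*"},
}
second = {
    "start": "20230401-20:25:00",
    "end": "20230405-20:25:10",
    "filter": {"plugin": "*mobotix-scan.*", "name": "upload"},
}


def flatten(args, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {"filter.plugin": ...}."""
    flat = {}
    for key, value in args.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def diff_args(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key missing on one side is reported with None for that side.
    """
    fa, fb = flatten(a), flatten(b)
    keys = sorted(fa.keys() | fb.keys())
    return {k: (fa.get(k), fb.get(k)) for k in keys if fa.get(k) != fb.get(k)}


differences = diff_args(first, second)
for key, (old, new) in differences.items():
    print(f"{key}: {old!r} -> {new!r}")
```

Running it lists every argument that differs between the two calls, which makes it easy to see whether anything besides start changed.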

seanshahkarami (Collaborator) commented

Thanks for sharing!

After looking more closely, this may actually be related to how the wildcard matching works. Notice that in one case you're filtering on "*mobotix-scan*" and in the other on "*mobotix-scan.*".

I'll double check to make sure something unexpected isn't happening when that's compiled to a regex internally.
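
To illustrate why the extra "." could matter, here is a rough sketch of two plausible ways a wildcard pattern might be turned into a regex. Both translation functions are hypothetical and written only for this example; neither is confirmed to be what sage-data-client or the data API does internally. The plugin name is decoded from the query-browser link in the first comment.

```python
import re


def glob_escaped(pattern):
    """Hypothetical translation 1: '*' becomes '.*', every other character
    is treated literally (so '.' matches only a literal dot)."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + "$")


def glob_unescaped(pattern):
    """Hypothetical translation 2: '*' becomes '.*', every other character
    is passed through as regex syntax (so '.' matches any one character)."""
    return re.compile("^" + pattern.replace("*", ".*") + "$")


# Plugin name decoded from the query-browser URL in the first comment.
plugin = "registry.sagecontinuum.org/bhupendraraut/mobotix-scan:0.23.4.24"

# For each pattern, record (matches under translation 1, matches under translation 2).
results = {}
for pattern in ("*mobotix-scan*", "*mobotix-scan.*"):
    results[pattern] = (
        bool(glob_escaped(pattern).match(plugin)),
        bool(glob_unescaped(pattern).match(plugin)),
    )
    print(pattern, results[pattern])
```

Under these assumptions the two translations agree on "*mobotix-scan*" but disagree on "*mobotix-scan.*", because the character after "mobotix-scan" in this plugin name is ":" rather than a literal dot. That is the kind of discrepancy worth ruling out in the real implementation.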
