DM-41879: Implement URL signing for client/server #920
Conversation
Force-pushed from 5def122 to 59d1e8b (Compare)
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #920      +/-   ##
==========================================
+ Coverage   87.55%   87.60%   +0.04%
==========================================
  Files         295      295
  Lines       38241    38259      +18
  Branches     8088     8088
==========================================
+ Hits        33482    33516      +34
+ Misses       3553     3539      -14
+ Partials     1206     1204       -2

☔ View full report in Codecov by Sentry.
Force-pushed from 24ac2c9 to 8ab6368 (Compare)
Some nice cleanups. I agree that checking the file size after download is fine, since that is the path that should always happen and a bad size is meant to be very rare (so the cost of downloading the file and immediately raising an exception should not be burdensome).
@@ -1998,10 +1998,19 @@ def get(
     def prepare_get_for_external_client(self, ref: DatasetRef) -> FileDatastoreGetPayload:
         # Docstring inherited

+        # 8 hours. Chosen somewhat arbitrarily -- this is long enough that the
Last time we talked with Russ in Princeton, we decided that a few minutes was reasonable here, since the signing should only happen close to the getting (and we wouldn't be automatically signing the results from query.datasets()), and the server shouldn't be returning URLs that live long enough for people to post on social media. If getting a single dataset is taking hours then we have bigger problems. (I was also under the impression that once the download starts, passing the expiration wouldn't cut it off, but that may be a misunderstanding of how chunked downloads work.)
It won't cut off the download, but if you need to retry at the HTTP layer, the signature will need to remain valid. It's nice if you can allow HTTP retries without having to throw an exception all the way up to Butler.get() to re-do the API request, too.
Or, e.g., for multiple range requests where some slow work happens in between reading the chunks, the signature will need to stay valid for the duration of that work.
Also, we had discussed something like a get_many(), so the download might not start immediately if there are other files queued on a slow link.
I would argue that a few minutes isn't really long enough. 10-20 minutes is probably OK, but I think it's best if we can make this the longest length that isn't a political problem. How would you feel about 1-2 hours? I think it's best to have timeouts one or two orders of magnitude higher than the worst expected typical case, so things don't blow up immediately when the system is under stress.
Also, I really don't buy the social media thing, given that we provide a function for downloading the files, which could then be stored wherever you want for as long as you want.
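For context, a minimal sketch of how an expiration window is typically attached to an S3 presigned URL with boto3; the bucket, key, and 1-hour value are illustrative placeholders, not what this PR actually uses:

```python
import boto3

# Sketch only: presign a GET for a single artifact.  Any HTTP retries or
# range requests for this object must complete before ExpiresIn elapses.
s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-butler-bucket", "Key": "example/artifact.fits"},
    ExpiresIn=3600,  # seconds; the lifetime being debated in this thread
)
```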
@@ -225,7 +226,7 @@ def _read_artifact_into_memory(
         formatter = getInfo.formatter
         nbytes_max = 10_000_000  # Arbitrary number that we can tune
-        if resource_size <= nbytes_max and formatter.can_read_bytes():
+        if recorded_size >= 0 and recorded_size <= nbytes_max and formatter.can_read_bytes():
Good catch. The size of -1 for "unknown" was added long after this check was written.
@@ -311,4 +315,6 @@ def addDataset(
         if run:
             self.butler.registry.registerCollection(run, type=CollectionType.RUN)
         metric = self._makeExampleMetrics()
-        self.butler.put(metric, self.datasetType if datasetType is None else datasetType, dataId, run=run)
+        return self.butler.put(
Please add a Returns section to the docstring.
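For reference, a hedged sketch of what such a Returns section could look like in the numpydoc style used elsewhere in this codebase; the exact wording is up to the author:

```python
        Returns
        -------
        ref : `lsst.daf.butler.DatasetRef`
            The ref of the dataset that was stored by the ``put``.
```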
tests/test_server.py (outdated)
@@ -51,6 +52,7 @@
     removeTestTempDir,
 )
 from lsst.resources.http import HttpResourcePath
+from lsst.resources.s3utils import clean_test_environment_for_s3, getS3Client
S3 dependencies are not installed by default in the pyproject.toml definition, so if server testing now requires boto3 to be available, this import is going to need import protection so that the tests can be skipped.
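As an illustration, one common import-protection pattern (the class name here is hypothetical, not the PR's actual test case):

```python
import unittest

try:
    import boto3
    from lsst.resources.s3utils import clean_test_environment_for_s3, getS3Client
except ImportError:
    # boto3 (and therefore the S3 helpers) may not be installed.
    boto3 = None


@unittest.skipIf(boto3 is None, "S3 support (boto3) is not available")
class ExampleServerTestCase(unittest.TestCase):
    """Hypothetical stand-in for the server test case guarded this way."""
```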
tests/test_server.py (outdated)
def _create_corrupted_dataset(repo: MetricTestRepo) -> DatasetRef:
    run = "corrupted-run"
    ref = repo.addDataset({"instrument": "DummyCamComp", "visit": 423}, run=run)
    uris = repo.butler.getURIs(ref, run=run)
I'm hoping that the run parameter is not needed here because the ref knows its own run.
Force-pushed from 9ef02ce to d718a09 (Compare)
Force-pushed from 7af2210 to 55d0b33 (Compare)
Deduplicated some environment cleanup logic between resources and daf_butler
Butler server will initially only support returning HTTP URLs to the client via signing of S3 URLs, so our tests need to use an S3 root for the datastore.
Move the resource size check when retrieving artifacts so it uses the downloaded data size, instead of checking the size separately. HEAD is not supported on S3 URLs presigned for GET, so HttpResourcePath.size() does not work for them. This is also a slight optimization, since it saves an unnecessary network round trip.
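As a rough illustration of the reordered check (variable names are stand-ins, not the actual datastore code):

```python
# Sketch only: validate the size from the downloaded bytes rather than with a
# separate size()/HEAD request, which presigned GET URLs may reject.
data = uri.read()  # "uri" stands in for a ResourcePath-like object
if recorded_size >= 0 and len(data) != recorded_size:
    raise RuntimeError(
        f"Size mismatch for {uri}: expected {recorded_size} bytes, got {len(data)}"
    )
```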
Force-pushed from 55d0b33 to e6639aa (Compare)
In the currently planned use case for Butler client/server, the server uses S3 for storing artifacts and gives clients access to them by presigning URLs.
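For orientation, a hedged sketch of the client side of that flow; the presigned URL is a placeholder, and the real client goes through the Butler APIs rather than raw HTTP calls:

```python
import requests

# Sketch only: the server hands back an HTTPS presigned URL and the client
# fetches the artifact with a plain GET before the signature expires.
presigned_url = "https://example-bucket.s3.amazonaws.com/artifact?X-Amz-Signature=..."
response = requests.get(presigned_url, timeout=60)
response.raise_for_status()
artifact_bytes = response.content
```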
Force-pushed from e6639aa to a633a1a (Compare)
Requires lsst/resources#73
Checklist
doc/changes