Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENG-4624] [S3 Improvements] Project PR - OSF Backend Part #10416

Merged
merged 16 commits into from
Aug 17, 2023

Conversation

cslzchen
Copy link
Collaborator

@cslzchen cslzchen commented Jul 12, 2023

Purpose

Enable OSF users to choose subfolders as add-on storage root in project for S3.

Credit @Johnetordoff for all the work 👍

Project notion: https://www.notion.so/cos/S3-improvements-476a5b07cb7e4f458e9cd4c77cfa03ec
WB part: CenterForOpenScience/waterbutler#406

Changes

  • Enables users to configure subfolder as add-on root for S3
  • Uses boto3 to get S3 buckets
  • Uses :/ as the new delimiter for S3 bucket
  • Added a management command to migrate existing configuration to use the :/ delimiter
  • Added/Updated unit tests

DevOps Notes

  • Deployment requires downtime for two reasons
    • Must be deployed at the same time as WB
    • Must run the management command before any user configures an S3 bucket
  • For details, see our release playbook (TBD)

Dev Notes

Here are all child PRs that have been merged into this feature branch.

QA Notes

See QA docs in the project notion page

Documentation

Update our user guide for using S3 (TBD)

Side Effects

  • Our S3 accounts used for testing is and U.S. account, which has limitation on the enabled buckets.
  • With boto3, there are new bucket regions that cannot be handled by WB. With our U.S. account, there are 3 of them (Asia Pacific (Osaka) ap-northeast-3, EU (Paris) eu-west-3 and EU (Stockholm) eu-north-1.)

Ticket

https://openscience.atlassian.net/browse/ENG-4624

addons/s3/utils.py Outdated Show resolved Hide resolved
@cslzchen cslzchen changed the title Reference PR: S3 Improvements (feature -> develop) [S3 Improvements] Reference PR: feature -> develop Jul 14, 2023
@@ -78,29 +87,43 @@ def set_folder(self, folder_id, auth):
self.folder_name = '{} ({})'.format(folder_id, bucket_location)
self.save()

self.nodelogger.log(action='bucket_linked', extra={'bucket': str(folder_id)}, save=True)
Copy link
Contributor

@Johnetordoff Johnetordoff Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mfraezz I was wondering what you think about this, obviously I want to avoid migrating the logs, I figure if this is purely additive (the bucket name will stay the same string.) we can add path a new thing without causing discontinuous historical data. I feel just adding something to the logs like this is appropriate if it's in extra.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a Product decision whether we want to display folder info in these logs. Presently, it will maintain existing behavior and just say linked the Amazon S3 bucket ${bucket} sans any folder path info, which is fine if Product signs off on it.

@cslzchen cslzchen changed the title [S3 Improvements] Reference PR: feature -> develop [ENG-4624] [S3 Improvements] Project PR - OSF Backend Part Aug 7, 2023
@cslzchen cslzchen marked this pull request as ready for review August 8, 2023 12:07
@cslzchen cslzchen force-pushed the feature/s3-improvements branch 2 times, most recently from 129f8ec to 56d258e Compare August 8, 2023 12:24
Copy link
Member

@mfraezz mfraezz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small request, one Minor open question. Otherwise LGTM :octocat:

addons/s3/utils.py Outdated Show resolved Hide resolved
@@ -135,10 +158,16 @@ def serialize_waterbutler_credentials(self):
}

def serialize_waterbutler_settings(self):
"""
We use the folder id to hold the bucket location
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, and I don't really know the best place for this one, but related to the comment:

Using folder_id for the bucket and folder_path for the... folder path would be more consistent other addon behavior. What was the rationale for not doing so -- avoiding a migration?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes avoiding the migration was the rational.

@mfraezz mfraezz merged commit f0323f9 into develop Aug 17, 2023
12 checks passed
@cslzchen cslzchen deleted the feature/s3-improvements branch September 6, 2024 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants