-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add aws sts library to support IRSA for logging #263
Conversation
@bothra90 -- FYI. You mentioned here #177 (comment) that you had to make some changes and were able to get things working. Does this look right to you? Did I miss anything? |
cc @pmossman -- Thanks for the help with the previous PR; FYI for this follow on. |
@msaffitz: Here's a git patch for what I was able to get working: https://gist.github.com/bothra90/62403b32449c90807ee4b8ec9c14f862. Your PR is in particular missing the necessary changes to |
Thanks @bothra90 . With removing the values from (I'm curious about the error you saw with leaving the values in the XML file -- I'd think the null / empty string values here would both be ok? Unfortunately I don't have a good way to test this locally at the moment as our AWS stuff is only configured to pull from official sources). |
Yes, I tested that logging works with the values removed. |
Awesome, thanks for confirming. I'll update this PR this evening with those additional changes. |
@@ -12,8 +12,6 @@ | |||
|
|||
<Property name="s3-bucket">${sys:S3_LOG_BUCKET:-${env:S3_LOG_BUCKET}}</Property> | |||
<Property name="s3-region">${sys:S3_LOG_BUCKET_REGION:-${env:S3_LOG_BUCKET_REGION}}</Property> | |||
<Property name="s3-aws-key">${sys:AWS_ACCESS_KEY_ID:-${env:AWS_ACCESS_KEY_ID}}</Property> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@msaffitz can you explain why the removal of these lines are necessary? What happens if you introduce the sts
library without removing these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not actually sure; I had the same question above (#263 (comment)) and @bothra90 indicated that removing them was necessary to get things functioning. @bothra90 -- do you have more context on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was unable to create a source from the console and seeing the following error log in airbyte-server
process:
Cannot end publish with com.van.logging.aws.S3PublishHelper@53f4b9a9 due to error:
Cannot end publishing: Cannot publish to S3: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code:
InvalidAccessKeyId; Request ID: TY5JJZ76SMD9BA86; S3 Extended Request ID: WEK4bCkgavHUqbgoywsSNVMwBt14/ke89r9mC/usS879oBQ1MYNw+S/2urb3xedmfWxWBpfj90TMLqnYL+hCfw==; Proxy: null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed this internally with some folks that have more familiarity around our Log4J2 setup. Simply removing these properties won't work -- we use an underlying log4j-appender implementation that will need to support IRSA.
We opened an issue for this in that project so we'll wait and see what the maintainer says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, it did the job for me. Happy to jump on a quick Slack huddle and demo it to you or @davinchia
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears that the underlying appender library that I linked above actually supports roles by default. The s3-aws-key
and s3-aws-secret
properties are optional, for situations where role-based credentials aren't available. That said, I'm concerned that if we simply delete these properties from the xml configuration, we could break Airbyte installations that are using those properties instead of roles.
An ideal solution would be backwards compatible for such installations, though I'm not sure what that would look like, since it seems like as soon as these tags are defined, that appender will try to use them, even if the value is blank
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the concern. Ultimately though, the passed parameters get used here and if the s3AwsSecret
and s3AwsKey
are not provided, the library will use the default AWS providers chain which will correctly use AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
. This is how these variables in the configuration are populated in the first place, so I don't suspect this would be an issue.
Having said that, I can try to test this out on my end and make sure nothing breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seconded @bothra90's comment. Given that Airbyte is already following the standard naming conversion for storing AWS keys which is how it's currently being passed as environment variables, I also don't see any issue with letting the DefaultAWSCredentialsProviderChain
handle it instead of being passed explicitly.
Using the ProviderChain would be backwards compatible with existing setup as any existing or new credentials will just be picked up according to their priority - in this case ENV first followed by roles / instance profile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bothra90 @weichunnn thanks for pointing this out, and apologies for the slow-moving back and forth here. I agree that it seems like these XML properties look unnecessary. @bothra90 if you're able to test and verify that this works as expected on your end, I'd love to see your results.
Otherwise, I'll open this PR against our internal repo where I can run our full suite of tests. Thanks for bearing with me here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmossman! Checking in to see if there are any updates on the review internally? 👀
/create-oss-pr |
@msaffitz @bothra90 @weichunnn I was able to test these changes with accessKey/secretKey configuration and verified it's all working as expected. The internal PR is merged, so I am going to close this PR. Thank you all for your engagement and contribution! |
What / How
Adds libs.aws.java.sdk.sts to the classpath to expose WebIdentityTokenFileCredentialsProvider for use with IRSA (IAM Roles for Service Accounts) for logging to S3. This (hopefully!) will enable use of S3 without providing keys. The docs on this for v1 of the SDK are little ambiguous; the v2 docs are much clearer. See also the discussion here
This builds on the previous work done in #7766 and supersedes #177.
Can this PR be safely reverted / rolled back?
🚨 User Impact 🚨
No known user impact.