Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add aws sts library to support IRSA for logging #263

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion airbyte-commons-worker/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -29,9 +29,9 @@ dependencies {
annotationProcessor libs.lombok
implementation libs.bundles.apache
implementation libs.bundles.log4j
implementation libs.aws.java.sdk.s3
implementation libs.google.cloud.storage
implementation libs.aws.java.sdk.s3
implementation libs.aws.java.sdk.sts
implementation libs.s3

implementation project(':airbyte-api')
Expand Down
5 changes: 0 additions & 5 deletions airbyte-commons/src/main/resources/log4j2.xml
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,6 @@

<Property name="s3-bucket">${sys:S3_LOG_BUCKET:-${env:S3_LOG_BUCKET}}</Property>
<Property name="s3-region">${sys:S3_LOG_BUCKET_REGION:-${env:S3_LOG_BUCKET_REGION}}</Property>
<Property name="s3-aws-key">${sys:AWS_ACCESS_KEY_ID:-${env:AWS_ACCESS_KEY_ID}}</Property>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msaffitz can you explain why the removal of these lines are necessary? What happens if you introduce the sts library without removing these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure; I had the same question above (#263 (comment)) and @bothra90 indicated that removing them was necessary to get things functioning. @bothra90 -- do you have more context on this?

Copy link

@bothra90 bothra90 Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was unable to create a source from the console and seeing the following error log in airbyte-server process:

Cannot end publish with com.van.logging.aws.S3PublishHelper@53f4b9a9 due to error:
Cannot end publishing: Cannot publish to S3: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code:
InvalidAccessKeyId; Request ID: TY5JJZ76SMD9BA86; S3 Extended Request ID: WEK4bCkgavHUqbgoywsSNVMwBt14/ke89r9mC/usS879oBQ1MYNw+S/2urb3xedmfWxWBpfj90TMLqnYL+hCfw==; Proxy: null)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this internally with some folks that have more familiarity around our Log4J2 setup. Simply removing these properties won't work -- we use an underlying log4j-appender implementation that will need to support IRSA.

We opened an issue for this in that project so we'll wait and see what the maintainer says.

Copy link

@bothra90 bothra90 Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it did the job for me. Happy to jump on a quick Slack huddle and demo it to you or @davinchia

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that the underlying appender library that I linked above actually supports roles by default. The s3-aws-key and s3-aws-secret properties are optional, for situations where role-based credentials aren't available. That said, I'm concerned that if we simply delete these properties from the xml configuration, we could break Airbyte installations that are using those properties instead of roles.

An ideal solution would be backwards compatible for such installations, though I'm not sure what that would look like, since it seems like as soon as these tags are defined, that appender will try to use them, even if the value is blank

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the concern. Ultimately though, the passed parameters get used here and if the s3AwsSecret and s3AwsKey are not provided, the library will use the default AWS providers chain which will correctly use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. This is how these variables in the configuration are populated in the first place, so I don't suspect this would be an issue.

Having said that, I can try to test this out on my end and make sure nothing breaks.

Copy link

@weichunnn weichunnn Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconded @bothra90's comment. Given that Airbyte is already following the standard naming conversion for storing AWS keys which is how it's currently being passed as environment variables, I also don't see any issue with letting the DefaultAWSCredentialsProviderChain handle it instead of being passed explicitly.

Using the ProviderChain would be backwards compatible with existing setup as any existing or new credentials will just be picked up according to their priority - in this case ENV first followed by roles / instance profile

Reference: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bothra90 @weichunnn thanks for pointing this out, and apologies for the slow-moving back and forth here. I agree that it seems like these XML properties look unnecessary. @bothra90 if you're able to test and verify that this works as expected on your end, I'd love to see your results.

Otherwise, I'll open this PR against our internal repo where I can run our full suite of tests. Thanks for bearing with me here!

Copy link

@weichunnn weichunnn Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pmossman! Checking in to see if there are any updates on the review internally? 👀

<Property name="s3-aws-secret">${sys:AWS_SECRET_ACCESS_KEY:-${env:AWS_SECRET_ACCESS_KEY}}</Property>
<Property name="s3-minio-endpoint">${sys:S3_MINIO_ENDPOINT:-${env:S3_MINIO_ENDPOINT}}</Property>
<Property name="s3-path-style-access">${sys:S3_PATH_STYLE_ACCESS:-${env:S3_PATH_STYLE_ACCESS}}</Property>

Expand Down Expand Up @@ -102,7 +100,6 @@
verbose="true"
stagingBufferAge="1"
s3Bucket="${s3-bucket}" s3Path="job-logging${ctx:cloud_job_log_path}" s3Region="${s3-region}"
s3AwsKey="${s3-aws-key}" s3AwsSecret="${s3-aws-secret}"
s3ServiceEndpoint="${s3-minio-endpoint}" s3PathStyleAccess="${s3-path-style-access}"
gcpStorageBucket="${gcs-log-bucket}" gcpStorageBlobNamePrefix="job-logging${ctx:cloud_job_log_path}">
<PatternLayout pattern="${default-pattern}"/>
Expand All @@ -126,7 +123,6 @@
verbose="true"
stagingBufferAge="1"
s3Bucket="${s3-bucket}" s3Path="job-logging${ctx:cloud_job_log_path}" s3Region="${s3-region}"
s3AwsKey="${s3-aws-key}" s3AwsSecret="${s3-aws-secret}"
s3ServiceEndpoint="${s3-minio-endpoint}" s3PathStyleAccess="${s3-path-style-access}"
gcpStorageBucket="${gcs-log-bucket}" gcpStorageBlobNamePrefix="job-logging${ctx:cloud_job_log_path}">
<PatternLayout pattern="${simple-pattern}"/>
Expand Down Expand Up @@ -168,7 +164,6 @@
<Log4j2Appender name="app-logging/${ctx:cloud_workspace_app_root}/"
stagingBufferAge="1"
s3Bucket="${s3-bucket}" s3Path="app-logging${ctx:cloud_workspace_app_root}" s3Region="${s3-region}"
s3AwsKey="${s3-aws-key}" s3AwsSecret="${s3-aws-secret}"
s3ServiceEndpoint="${s3-minio-endpoint}" s3PathStyleAccess="${s3-path-style-access}"
gcpStorageBucket="${gcs-log-bucket}" gcpStorageBlobNamePrefix="app-logging${ctx:cloud_workspace_app_root}">
<PatternLayout pattern="${default-pattern}"/>
Expand Down
1 change: 1 addition & 0 deletions airbyte-config/config-models/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ dependencies {
annotationProcessor libs.lombok
implementation libs.google.cloud.storage
implementation libs.aws.java.sdk.s3
implementation libs.aws.java.sdk.sts
implementation libs.s3
implementation libs.bundles.apache

Expand Down
1 change: 1 addition & 0 deletions airbyte-oauth/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ dependencies {
implementation libs.bundles.apache
implementation libs.appender.log4j2
implementation libs.aws.java.sdk.s3
implementation libs.aws.java.sdk.sts

implementation project(':airbyte-commons')
implementation project(':airbyte-config:config-models')
Expand Down
3 changes: 2 additions & 1 deletion airbyte-server/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ configurations.all {
// instead of versions from transitive dependencies
// Force to avoid updated version brought in transitively from Micronaut 3.8+
// that is incompatible with our current Helm setup
force libs.flyway.core, libs.s3, libs.aws.java.sdk.s3
force libs.flyway.core, libs.s3, libs.aws.java.sdk.s3, libs.aws.java.sdk.sts
}
}

Expand All @@ -34,6 +34,7 @@ dependencies {
implementation libs.flyway.core
implementation libs.s3
implementation libs.aws.java.sdk.s3
implementation libs.aws.java.sdk.sts

implementation project(':airbyte-analytics')
implementation project(':airbyte-api')
Expand Down
3 changes: 2 additions & 1 deletion airbyte-workers/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ configurations.all {
resolutionStrategy {
// Ensure that the versions defined in deps.toml are used
// instead of versions from transitive dependencies
force libs.flyway.core, libs.jooq, libs.s3, libs.aws.java.sdk.s3
force libs.flyway.core, libs.jooq, libs.s3, libs.aws.java.sdk.s3, libs.aws.java.sdk.sts
}
}

Expand All @@ -47,6 +47,7 @@ dependencies {
implementation libs.jooq
implementation libs.s3
implementation libs.aws.java.sdk.s3
implementation libs.aws.java.sdk.sts
implementation 'com.google.auth:google-auth-library-oauth2-http:1.4.0'
implementation 'com.auth0:java-jwt:3.19.2'
implementation libs.kubernetes.client
Expand Down
1 change: 1 addition & 0 deletions deps.toml
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ apache-cxf-core = { module = "org.apache.cxf:cxf-core", version = "3.4.2" }
appender-log4j2 = { module = "com.therealvan:appender-log4j2", version = "3.6.0" }
assertj-core = { module = "org.assertj:assertj-core", version = "3.21.0" }
aws-java-sdk-s3 = { module = "com.amazonaws:aws-java-sdk-s3", version = "1.12.6" }
aws-java-sdk-sts = {module = "com.amazonaws:aws-java-sdk-sts", version = "1.12.6"}
byte-buddy = { module = "net.bytebuddy:byte-buddy", version = "1.12.14" }
commons-io = { module = "commons-io:commons-io", version.ref = "commons_io" }
connectors-testcontainers = { module = "org.testcontainers:testcontainers", version.ref = "connectors-testcontainers" }
Expand Down
Loading