-
Notifications
You must be signed in to change notification settings - Fork 40
provide ability to cache boto client instances directly and on S3Bucket
#369
Conversation
S3Bucket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
prefect_aws/s3.py
Outdated
cache_client: bool = Field( | ||
default=False, | ||
description=( | ||
"If True, the S3 client will be cached. This is useful for " | ||
"performance, but could cause issues if the S3 client is used " | ||
"in multi-threaded environments." | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this toggle makes more sense on the AwsCredentials
class since it's closer to where the client is instantiated and allows other consumers of AwsCredentials
to use the toggle.
Also, do you know what issues client caching can cause in multi-threaded environments and if we can mitigate those issues? I'd prefer to not expose this toggle if we don't have to.
tests/test_credentials.py
Outdated
|
||
|
||
@pytest.mark.parametrize("client_type", [member.value for member in ClientType]) | ||
def test_get_client_cached(client_type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test that ensures that changing some configuration on an AwsCredentials
instance after instantiation and fetching a client causes a cache miss?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
closes #332
This PR introduces an enhanced default caching strategy for boto client instances within our
AwsCredentials
andMinIOCredentials
classes.lru_cache
mechanism, keyed on the credentials context (ctx
) and client type.__hash__
method ofAwsCredentials
andMinIOCredentials
is based on the combination of these fields, ensuring that any changes to the credentials object results in a different hash and thus a new cache entry._LOCK
), which synchronizes access to theget_client
function, preventing race conditions in multi-threaded environments.thanks to @mattiamatrix for doing the bulk of this work