Feat: Add Cloud Interop and Robust Secrets Management #143

Merged
aaronsteers merged 120 commits into main from aj/feat/add-airbyte-api-library
Apr 10, 2024

aaronsteers
Contributor

@aaronsteers aaronsteers commented Mar 26, 2024

Resolves: #33
Resolves: #104
Resolves: #54

What's new in the secrets module

  1. New, refactored secrets module. The previous secrets code has moved from a single secrets.py file to submodules per topic/retriever. Most of the core code is unchanged, except...
  2. Renamed SecretSource to SecretSourceEnum, to better distinguish from the new SecretManager classes.
  3. New foundational classes:
    1. SecretString - inherits from str, but adds some security measures:
      • Calls to repr() will be masked. This includes printing a class or a dict that may contain secret properties.
      • The parse_json() implementation streamlines casting to dict (a frequent use case) and ensures that exceptions don't inadvertently print the value.
    2. SecretManager - Has the get_secret(<name>) method.
    3. CustomSecretManager - Abstract base class for custom implementations. Custom secret managers additionally register themselves in the list of secret sources that are automatically checked when ab.get_secret() is called.
    4. SecretHandle - A pointer to a (not-yet-retrieved) secret. Allows streamlined iteration over a list of secrets and secret names without the cost/risk of retrieving the values locally.
  4. New GoogleGSMSecretManager class.
    • This replaces a dependency on ci_credentials. It inherits from SecretManager and also adds fetch_secrets, fetch_secrets_by_label and fetch_connector_secrets.
    • The fetch_connector_secrets code is what we use in integration tests to replace the ci_credentials dependency. It can now also be used at runtime to test any source, without requiring an import of any external libraries.
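
To illustrate the SecretString idea described above, here is a minimal sketch of a str subclass with a masked repr() and a JSON helper. The class name MaskedSecret, the mask text, and the error message are assumptions made for illustration; this is not the code from this PR.

```python
import json


class MaskedSecret(str):
    """Hypothetical stand-in for the SecretString idea; not PyAirbyte's code."""

    def __repr__(self) -> str:
        # Containers such as dicts and lists call repr() on their items, so
        # masking here keeps the value out of most accidental debug output.
        return "<MaskedSecret: ****>"

    def parse_json(self) -> dict:
        """Parse the secret as JSON without echoing it in the error message."""
        try:
            return json.loads(self)
        except json.JSONDecodeError:
            # Raise a fresh error with a generic message. "from None" hides the
            # original decode error from the displayed traceback.
            raise ValueError("Secret is not valid JSON.") from None


secret = MaskedSecret('{"api_key": "abc123"}')
config = {"credentials": secret}

# The dict repr shows the mask, while the underlying string is unchanged.
print(repr(config))
print(secret.parse_json()["api_key"])
```

Note that in this sketch only repr() is masked: str(secret) and f-strings still yield the raw value, so the object remains usable wherever a plain string is expected.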

The above changes are performed in a backwards compatible way, so that ab.get_secret() still works just as before.

What's new in the exceptions module

All existing exceptions are unchanged, except that the AirbyteLib prefix is replaced by PyAirbyte. More care has also been taken to distinguish Airbyte* exceptions (protocol) from PyAirbyte* exceptions (Python-specific). A group of new AirbyteCloud* exceptions (from Cloud) has also been added with its own prefix.

Note:

  • The global rename of exceptions to be prefixed as PyAirbyte* has caused a large number of files to show as modified. (Sorry! 😊)
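
The prefix convention described in this section can be pictured as separate exception families that callers catch independently. The class names below are hypothetical placeholders chosen to show the prefixes; they are not the actual classes in the exceptions module, and whether the real families share a base class is not stated here.

```python
class AirbyteExampleError(Exception):
    """Hypothetical protocol-level family (Airbyte* prefix)."""


class PyAirbyteExampleError(Exception):
    """Hypothetical Python-library family (PyAirbyte* prefix)."""


class PyAirbyteExampleInputError(PyAirbyteExampleError):
    """Hypothetical subclass, e.g. for bad user input."""


class AirbyteCloudExampleError(Exception):
    """Hypothetical Cloud family (AirbyteCloud* prefix)."""


def classify(err: Exception) -> str:
    """Report which family an exception belongs to."""
    if isinstance(err, PyAirbyteExampleError):
        return "python"
    if isinstance(err, AirbyteCloudExampleError):
        return "cloud"
    if isinstance(err, AirbyteExampleError):
        return "protocol"
    return "other"


print(classify(PyAirbyteExampleInputError("bad input")))
```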

Reviewing Docs

I would highly recommend reviewing the autogenerated docs for both the secrets module and the cloud module. They are available from Actions here in GitHub, as a downloadable artifact from the docs-generate job.

Here is a recent copy:

generated-docs.zip

To use, simply download the zip, double-click to decompress, then double-click to open the included "index.html" file.

Pre-Merge TODO

  1. Review my own self-review action items, below in this PR thread.
  2. Consider classes for removal or downgrade to experimental status if we don't need them - specifically the Cloud* classes.
  3. Consider removing or hiding the deployment and management capabilities for Cloud - specifically the deploy*() and delete*() methods.
  4. Perform a full review of the auto-generated docs - checking for any surface area that should remain non-public for now.
  5. Docs: Explain which secret names (or env-var names) are needed for reading from which destinations.
  6. Optionally merge the airbyte-api rename in the other repo, or else pin to a specific commit instead of a branch ref.
  7. Check the API responses for additional properties we want to expose on SyncResult, CloudConnection classes.

@aaronsteers
Contributor Author

aaronsteers commented Mar 26, 2024

Quick update. I've added create+delete tests for a source and destination. Both tests are passing. 🎉

This isn't the final API, but you can see how it works in the below excerpt:

import ulid

# Note: api_util, SourceFaker, and DestinationDuckdb are imported elsewhere in
# the test module; their import lines are not part of this excerpt.


def test_create_and_delete_source(
    workspace_id: str,
    api_root: str,
    api_key: str,
) -> None:
    # A short ULID-derived suffix keeps resource names unique across test runs.
    new_resource_name = "deleteme-source-faker" + str(ulid.ULID()).lower()[-6:]
    source_config = SourceFaker()
    source = api_util.create_source(
        name=new_resource_name,
        api_root=api_root,
        api_key=api_key,
        workspace_id=workspace_id,
        config=source_config,
    )
    assert source.name == new_resource_name
    assert source.source_type == "faker"
    assert source.source_id

    # Clean up the resource that was just created.
    api_util.delete_source(
        source_id=source.source_id,
        api_root=api_root,
        api_key=api_key,
        workspace_id=workspace_id,
    )


def test_create_and_delete_destination(
    workspace_id: str,
    api_root: str,
    api_key: str,
    motherduck_api_key: str,
) -> None:
    new_resource_name = "deleteme-destination-faker" + str(ulid.ULID()).lower()[-6:]
    destination_config = DestinationDuckdb(
        destination_path="temp_db",
        motherduck_api_key=motherduck_api_key,
    )
    destination = api_util.create_destination(
        name=new_resource_name,
        api_root=api_root,
        api_key=api_key,
        workspace_id=workspace_id,
        config=destination_config,
    )
    assert destination.name == new_resource_name
    assert destination.destination_type == "duckdb"
    assert destination.destination_id

    # Clean up the resource that was just created.
    api_util.delete_destination(
        destination_id=destination.destination_id,
        api_root=api_root,
        api_key=api_key,
        workspace_id=workspace_id,
    )

@aaronsteers
Contributor Author

aaronsteers commented Apr 9, 2024

@bindipankhudi - FYI, I just cleaned up the docs a bit more and put a link to a recent docs snapshot in the PR description.

I found it very helpful to review the docs, and you might too (especially the secrets and cloud submodules):

generated-docs.zip

Contributor

@bindipankhudi bindipankhudi left a comment


Great work! :) Looks great to me. Left some comments, nothing blocking.

@aaronsteers aaronsteers changed the title from "Feat: Add Cloud interop via Airbyte API Integrations" to "Feat: Add Cloud Interop and Robust Secrets Management" Apr 10, 2024
@aaronsteers aaronsteers enabled auto-merge (squash) April 10, 2024 20:57
@aaronsteers aaronsteers merged commit 3a7ba19 into main Apr 10, 2024
17 checks passed
@aaronsteers aaronsteers deleted the aj/feat/add-airbyte-api-library branch April 10, 2024 21:05