Feature Idea: PyAirbyte CLI #409

Closed
aaronsteers opened this issue Oct 5, 2024 · 1 comment

aaronsteers commented Oct 5, 2024

Let's add a CLI for PyAirbyte...

Decision 1: Entrypoint Name

We could use pyairbyte or airbyte as the entrypoint (CLI) name.

While airbyte matches the library name, I slightly prefer pyairbyte or another CLI name, to make it clear that users are invoking PyAirbyte and to distinguish it more clearly from Airbyte Platform, Terraform, abctl, and any other Airbyte REST API wrapper.

I like pyairbyte with the more concise alias pyab providing the same functionality.

Decision 2: Verb selection

I lean towards matching the classic verbs (read, write, discover, etc.), except that we're going to streamline the invocation for people quite a lot.

Decision 3: Workload Descriptions: Yaml vs CLI args

We can save users some headache and make the CLI more concise by letting them use yaml files to describe jobs. Since our first use case for the CLI will likely match what we already have for acceptance tests, I think we should make sure we at least support the acceptance-test-config.yaml format or the new inlining of this into metadata.yaml files.

I lean towards the following implementation:

  • Let users provide a workload name, which loosely corresponds to the name of its config file, minus the /secrets/ path prefix and the .json suffix.
  • If no workload name is provided, we'll default to the first definition with the default config file name: config.json.
  • If more than one workload is defined, and no workload name is provided, and none are named config.json, then we'll fail and ask for a workload name.
  • We'll also let users override specific inputs by providing them via the command line.
  • CLI args take precedence over yaml config. Neither is required if the other is fully declarative.
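
A rough sketch of the resolution rules above, in Python. Everything named here is hypothetical: the function names, the `config_path` key on each workload, and the choice to fall back to the only workload when exactly one is defined are assumptions made for illustration, not an existing PyAirbyte API.

```python
from __future__ import annotations


def workload_name_from_config_path(config_path: str) -> str:
    """Derive a workload name: drop the `secrets/` prefix and the `.json` suffix."""
    name = config_path.lstrip("/")
    if name.startswith("secrets/"):
        name = name[len("secrets/"):]
    if name.endswith(".json"):
        name = name[: -len(".json")]
    return name


def select_workload(workloads: list[dict], name: str | None = None) -> dict:
    """Pick a workload by name, or fall back to the one backed by config.json."""
    if not workloads:
        raise ValueError("No workloads are defined.")
    if name is not None:
        for workload in workloads:
            if workload_name_from_config_path(workload["config_path"]) == name:
                return workload
        raise ValueError(f"No workload named {name!r}.")
    # No name given: default to the first definition that uses config.json.
    for workload in workloads:
        if workload_name_from_config_path(workload["config_path"]) == "config":
            return workload
    # Assumption: a single definition is unambiguous even if not named config.json.
    if len(workloads) == 1:
        return workloads[0]
    raise ValueError("Multiple workloads are defined; please provide a workload name.")


def merge_inputs(yaml_inputs: dict, cli_args: dict) -> dict:
    """CLI args take precedence over yaml values; unset (None) CLI args are ignored."""
    merged = dict(yaml_inputs)
    merged.update({key: value for key, value in cli_args.items() if value is not None})
    return merged
```

With this shape, `select_workload(workloads)` covers the default case, `select_workload(workloads, "config_oauth")` covers an explicit name, and `merge_inputs` applies any command-line overrides last.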

Examples

Run the first 'full_refresh' workload defined in acceptance-test-config.yaml

pyab run --source=source-faker --source-job="acceptance-test-config.yaml:tests.full_refresh[0]" --destination=destination-snowflake --destination-config=../destination-snowflake/secrets/config.json

Run a performance benchmark using the same workload info.

pyab benchmark --source=source-faker --job="acceptance-test-config.yaml:tests.full_refresh[0]"
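
The job references in these examples ("file:dotted.path[index]") could be parsed along these lines. This is only a sketch: the helper names are hypothetical, the grammar (dot-separated keys plus `[N]` list indexes) is inferred from the two examples above, and the sample data stands in for a parsed yaml file.

```python
from __future__ import annotations

import re

# One token is either a mapping key (no dots or brackets) or a list index like [0].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_job_reference(ref: str) -> tuple[str, list[str | int]]:
    """Split 'file.yaml:tests.full_refresh[0]' into a file path and lookup tokens."""
    # Split on the last colon so a drive-letter colon in the file path survives.
    file_path, sep, key_path = ref.rpartition(":")
    if not sep or not file_path or not key_path:
        raise ValueError(f"Expected '<file>:<path>', got {ref!r}.")
    tokens: list[str | int] = []
    for match in _TOKEN.finditer(key_path):
        key, index = match.groups()
        tokens.append(key if key is not None else int(index))
    return file_path, tokens


def resolve(data: object, tokens: list[str | int]) -> object:
    """Walk nested mappings and lists using the parsed tokens."""
    for token in tokens:
        data = data[token]  # type: ignore[index]
    return data


# Illustrative data only, shaped to match the path used in the examples above.
sample = {"tests": {"full_refresh": [{"config_path": "secrets/config.json"}]}}
file_path, tokens = parse_job_reference(
    "acceptance-test-config.yaml:tests.full_refresh[0]"
)
job = resolve(sample, tokens)
```

Loading the yaml file itself is left out on purpose, so the sketch stays free of third-party dependencies.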
@aaronsteers aaronsteers changed the title Feat: PyAirbyte CLI Feature Idea: PyAirbyte CLI Oct 5, 2024
@aaronsteers

Resolved (first iteration) in:
