diff --git a/docs/connector-development/README.md b/docs/connector-development/README.md index c54df6ca7fa8..f1e6cb6f2d63 100644 --- a/docs/connector-development/README.md +++ b/docs/connector-development/README.md @@ -1,30 +1,32 @@ # Connector Development -### Before you start +If you'd like to build a connector that doesn't yet exist in Airbyte's catalog, in most cases you should use [Connector Builder](./connector-builder-ui/overview.md)! +Builder works for most API source connectors as long as you can read the data with HTTP requests (REST, GraphQL) and get results in JSON or JSONL formats, CSV and XML support to come soon. -Before building a new connector, review [Airbyte's data protocol specification](../understanding-airbyte/airbyte-protocol.md). As you begin, you should also familiarize yourself with our guide to [Best Practices for Connector Development](./best-practices.md). +In rare cases when you need something more complex, you can use the Low-Code CDK directly. Other options and SDKs are described below. -If you need support along the way, visit the [Slack channel](https://airbytehq.slack.com/archives/C027KKE4BCZ) we have dedicated to helping users with connector development where you can search previous discussions or ask a question of your own. +:::note -### Process overview +Before building a new connector, review [Airbyte's data protocol specification](../understanding-airbyte/airbyte-protocol.md). As you begin, you should also familiarize yourself with our guide to [Best Practices for Connector Development](./best-practices.md). +If you need support along the way, visit the [Slack channel](https://airbytehq.slack.com/archives/C027KKE4BCZ) we have dedicated to helping users with connector development where you can search previous discussions or ask a question of your own. -The first step in creating a new connector is to choose the tools you’ll use to build it. There are three basic approaches Airbyte provides to start developing a connector. 
To understand which approach you should take, review the [compatibility guide](./connector-builder-ui/connector-builder-compatibility.md). +::: -After building and testing your connector, you’ll need to publish it. This makes it available in your workspace. At that point, you can use the connector you’ve built to move some data! +### Process overview -If you want to contribute what you’ve built to the Airbyte Cloud and OSS connector catalog, follow the steps provided in the [contribution guide for submitting new connectors](../contributing-to-airbyte/submit-new-connector.md). +1. **Pick the technology and build**. The first step in creating a new connector is to choose the tools you’ll use to build it. For _most_ cases, you should start in Connector Builder. To understand which approach you should take, review the [compatibility guide](./connector-builder-ui/connector-builder-compatibility.md). +2. **Publish as a custom connector**. After building and testing your connector, you’ll need to publish it. This makes it available in your workspace. At that point, you can use the connector you’ve built to move some data! +3. **Contribute back to Airbyte**. If you want to contribute what you’ve built to the Airbyte Cloud and OSS connector catalog, follow the steps provided in the [contribution guide for submitting new connectors](../contributing-to-airbyte/submit-new-connector.md). ### Connector development options -| Tool | Description | -| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [Connector Builder](./connector-builder-ui/overview.md) | We recommend Connector Builder for developing a connector for an API source. 
If you’re using Airbyte Cloud, no local developer environment is required to create a new connection with the Connector Builder because you configure it directly in the Airbyte web UI. This tool guides you through creating and testing a connection. Refer to our [tutorial](./connector-builder-ui/tutorial.mdx) on the Connector Builder to guide you through the basics. | -| [Low Code Connector Development Kit (CDK)](./config-based/low-code-cdk-overview.md) | This framework lets you build source connectors for HTTP API sources. The Low-code CDK is a declarative framework that allows you to describe the connector using a [YAML schema](./schema-reference) without writing Python code. It’s flexible enough to include [custom Python components](./config-based/advanced-topics.md#custom-components) in conjunction with this method if necessary. | -| [Python Connector Development Kit (CDK)](./cdk-python/basic-concepts.md) | While this method provides the most flexibility to developers, it also requires the most code and maintenance. This library provides classes that work out-of-the-box for most scenarios you’ll encounter along with the generators to make the connector scaffolds for you. We maintain an [in-depth guide](./tutorials/custom-python-connector/0-getting-started.md) to building a connector using the Python CDK. 
| +| Tool | Description | +| ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| [Connector Builder](./connector-builder-ui/overview.md) | We recommend Connector Builder for developing a connector for an API source. If you’re using Airbyte Cloud, no local developer environment is required to create a new connection with the Connector Builder because you configure it directly in the Airbyte web UI. This tool guides you through creating and testing a connection. Refer to our [tutorial](./connector-builder-ui/tutorial.mdx) on the Connector Builder to guide you through the basics. | +| [Low Code Connector Development Kit (CDK)](./config-based/low-code-cdk-overview.md) | This framework lets you build source connectors for HTTP API sources. The Low-code CDK is a declarative framework that allows you to describe the connector using a [YAML schema](./schema-reference) without writing Python code. It’s flexible enough to include [custom Python components](./config-based/advanced-topics.md#custom-components) in conjunction with this method if necessary. | +| [Python Connector Development Kit (CDK)](./cdk-python/basic-concepts.md) | While this method provides the most flexibility to developers, it also requires the most code and maintenance. This library provides classes that work out-of-the-box for most scenarios you’ll encounter along with the generators to make the connector scaffolds for you. 
We maintain an [in-depth guide](./tutorials/custom-python-connector/0-getting-started.md) to building a connector using the Python CDK. | +| [Java CDK](./tutorials/building-a-java-destination.md) | If you're building a source or a destination against a traditional database (not an HTTP API, not a vector database), you should use the Java CDK instead. | -Most database sources and destinations are written in Java. API sources and destinations are written -in Python using the [Low-code CDK](config-based/low-code-cdk-overview.md) or -[Python CDK](cdk-python/). ### Community maintained CDKs diff --git a/docs/connector-development/migration-to-base-image.md b/docs/connector-development/cdk-python/migration-to-base-image.md similarity index 98% rename from docs/connector-development/migration-to-base-image.md rename to docs/connector-development/cdk-python/migration-to-base-image.md index 03c6f6c9f508..61e20c2ca5a2 100644 --- a/docs/connector-development/migration-to-base-image.md +++ b/docs/connector-development/cdk-python/migration-to-base-image.md @@ -3,7 +3,7 @@ We currently enforce our certified python connectors to use our [base image](https://hub.docker.com/r/airbyte/python-connector-base). This guide will help connector developers to migrate their connector to use our base image. -N.B: This guide currently only applies to python connectors. +N.B: This guide currently only applies to Python CDK connectors. 
## Prerequisite diff --git a/docs/connector-development/config-based/low-code-cdk-overview.md b/docs/connector-development/config-based/low-code-cdk-overview.md index 3a772d978b8d..b0d241d80624 100644 --- a/docs/connector-development/config-based/low-code-cdk-overview.md +++ b/docs/connector-development/config-based/low-code-cdk-overview.md @@ -3,11 +3,7 @@ Airbyte’s low-code framework enables you to build source connectors for REST APIs via a [connector builder UI](https://docs.airbyte.com/connector-development/connector-builder-ui/overview) or by modifying boilerplate YAML files via terminal or text editor. :::info -Developer updates will be announced via our #help-connector-development Slack channel. If you are using the CDK, please join to stay up to date on changes and issues. -::: - -:::note -The low-code framework is in **beta**, which means that while it will be backwards compatible, it’s still in active development. Share feedback and requests with us on our [Slack channel](https://slack.airbyte.com/) or email us at [feedback@airbyte.io](mailto:feedback@airbyte.io) +Developer updates will be announced via our [#help-connector-development Slack channel](https://airbytehq.slack.com/archives/C027KKE4BCZ). If you are using the CDK, please join to stay up to date on changes and issues. ::: ## Why low-code? 
@@ -65,7 +61,7 @@ If the answer to all questions is yes, you can use the low-code framework to bui ## Prerequisites - An API key for the source you want to build a connector for -- Python >= 3.9 +- Python >= 3.10 - Docker ## Overview of the process diff --git a/docs/connector-development/config-based/tutorial/0-getting-started.md b/docs/connector-development/config-based/tutorial/0-getting-started.md index 7e037c9979fa..7a57374263f5 100644 --- a/docs/connector-development/config-based/tutorial/0-getting-started.md +++ b/docs/connector-development/config-based/tutorial/0-getting-started.md @@ -1,7 +1,5 @@ # Getting Started -:warning: This framework is in **alpha**. It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at feedback@airbyte.io :warning: - ## Summary Throughout this tutorial, we'll walk you through the creation of an Airbyte source to read and extract data from an HTTP API. @@ -41,7 +39,7 @@ This can be done by signing up for the Free tier plan on [Exchange Rates Data AP ## Requirements - An Exchange Rates API key -- Python >= 3.9 +- Python >= 3.10 - [Poetry](https://python-poetry.org/) - Docker must be running - [`airbyte-ci`](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md#L1) CLI diff --git a/docs/connector-development/connector-builder-ui/overview.md b/docs/connector-development/connector-builder-ui/overview.md index 13e1286c6f1e..25fd086e52a5 100644 --- a/docs/connector-development/connector-builder-ui/overview.md +++ b/docs/connector-development/connector-builder-ui/overview.md @@ -1,56 +1,41 @@ # Connector Builder Intro -Connector Builder is a no-code tool that’s part of the Airbyte UI. 
It provides an intuitive user interface on top of the [low-code YAML format](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/yaml-overview) and lets you develop a connector to use in data syncs without ever needing to leave your Airbyte workspace. Connector Builder offers the most straightforward method for building and maintaining connectors. - -We recommend that you determine whether the connector you want can be built with the Connector Builder before looking at the Low-Code CDK or Python CDK. Our [compatibility guide](./connector-builder-compatibility.md) can help you decide if Connector Builder is the right tool to use. +Connector Builder is a no-code tool that’s part of the Airbyte UI. +It provides an intuitive user interface on top of the [low-code YAML format](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/yaml-overview) and lets you develop a connector to use in data syncs without ever needing to leave your Airbyte workspace. +Connector Builder offers the most straightforward method for building, contributing, and maintaining connectors. ## When should I use Connector Builder? -First, check if the API you want to use has an available connector in the [catalog](../../integrations). If you find it there, you can use it as is. If you need to update an existing connector, see the guide for updates. +First, check if the API you want to use has an available connector in the [catalog](../../integrations). If you find it there, you can use it as is. If you need to update an existing connector, see the guide for updates. -Generally, you can build a connector with the Connector Builder if you want to connect to an HTTP API that returns a collection of records as JSON and has fixed endpoints. For more detailed information on requirements, refer to the [compatibility guide](./connector-builder-compatibility.md). 
+You can build a connector with the Connector Builder if you want to connect to an HTTP API that returns a collection of records as JSON and has fixed endpoints. For more detailed information on requirements, refer to the [compatibility guide](./connector-builder-compatibility.md). ## Getting started The high-level process for using Connector Builder is as follows: -1. Access Connector Builder in the Airbyte web app by selecting "Builder" in the left-hand sidebar. -2. Iterate on your low-code connector by providing details for global configuration and user inputs. User inputs are the variables your connector will ask an end-user to provide when they configure a connector for use in a connection. -3. Once the connector is ready, publish it. This makes it available in your local workspace +1. Access Connector Builder in the Airbyte web app by selecting "Builder" in the left-hand sidebar +2. Iterate on the connector by providing details for global configuration and user inputs, and streams +3. Once the connector is ready, publish it to your workspace, or contribute it to Airbyte catalog 4. Configure a Source based on the released connector 5. Use the Source in a connection to sync data -The concept pages in this section of the docs share more details related to the following topics: [authentication](./authentication.md), [record processing](./record-processing.mdx), [pagination](./pagination.md), [incremental sync](./incremental-sync.md), [partitioning](./partitioning.md), and [error handling](./error-handling.md). +The concept pages in this section of the docs share more details related to the following topics: [authentication](./authentication.md), [record processing](./record-processing.mdx), [pagination](./pagination.md), [incremental sync](./incremental-sync.md), [partitioning](./partitioning.md), and [error handling](./error-handling.md). :::tip -Do not hardcode things like API keys or passwords while configuring a connector in the builder. 
They will be used, but not saved, during development when you provide them as Testing Values. For use in production, these should be passed in as user inputs after publishing the connector to the workspace, when you configure a source using your connector. +Do not hardcode things like API keys or passwords while configuring a connector in the builder. They will be used, but not saved, during development when you provide them as Testing Values. For use in production, these should be passed in as user inputs after publishing the connector to the workspace, when you configure a source using your connector. Follow [the tutorial](./tutorial.mdx) for an example of what this looks like in practice. ::: -## Exporting the connector - -:::info -If you choose to contribute your connector to the Airbyte connector catalog, making it publicly available outside of your workspace, you'll need to export it and go through the process of submitting it for review. -::: - -Connector Builder leverages the [low-code CDK](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/yaml-overview) under the hood, turning all configurations into the YAML format. Typically, it's not necessary to interact with the YAML representation. However, you can export the connector YAML into a file and build a docker image containing the connector which can be shared more widely: - -1. Use Connector Builder to iterate on your low-code connector -2. Export the YAML into a low-code connector module on your local machine -3. Build the connector's Docker image -4. Use the built connector image in Airbyte - -Once you're done iterating on your connector in the UI, you'll need to export the low-code YAML representation of the connector to your local filesystem into a connector module. This YAML can be downloaded by clicking the `Download Config` button in the bottom-left. 
- -Create a low-code connector module using the connector generator (see [this YAML tutorial for an example](../config-based/tutorial/1-create-source.md)) using the name you'd like to use for your connector. After creating the connector, overwrite the contents of `airbyte-integrations/connectors/source-/source_/manifest.yaml` with the YAML you created in the UI. +## Contributing the connector -Follow the instructions in the connector README to build the Docker image. Typically this will be something like `docker build . -t airbyte/source-:`. +If you'd like to share your connector with other Airbyte users, you can contribute it to Airbyte's GitHub repository right from the Builder. -From this point on your connector is a regular low-code CDK connector. It can now be distributed as a docker image and be made part of the regular Airbyte connector catalog. For more information, read the [overview page for the publishing process](/connector-development/#publishing-a-connector). +1. Click "Publish" chevron -> "Contribute to Marketplace" +2. Fill out the form: add the connector description, and provide your GitHub PAT (Personal Access Token) to create a pull request +3. Click "Contribute" to submit the connector to the Airbyte catalog -:::note -Connector Builder UI is in beta, which means it’s still in active development and may include backward-incompatible changes. Share feedback and requests with us on our Slack channel or email us at feedback@airbyte.io +Reviews typically take under a week. -Developer updates will be announced via our #help-connector-development Slack channel. If you are using the CDK, please join to stay up to date on changes and issues. -::: \ No newline at end of file +You can also export the YAML manifest file for your connector and share it with others. The manifest file contains all the information about the connector, including the global configuration, streams, and user inputs. 
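Reviewer note on the Connector Builder overview above: the docs say Builder fits HTTP APIs that return a collection of records as JSON (or JSONL). A rough way to sanity-check a sample response before starting in Builder is sketched below. This is only an illustration under stated assumptions: `parse_records` is a hypothetical helper, not part of Airbyte or the Builder, and it does not reproduce how Builder itself extracts records.

```python
import json


def parse_records(body: str) -> list[dict]:
    """Parse a response body as JSON or JSONL and return the record objects.

    Hypothetical helper for illustration only; not an Airbyte API.
    """
    text = body.strip()
    if not text:
        return []
    try:
        # First try the whole body as a single JSON document.
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Otherwise treat every non-empty line as its own JSON value (JSONL).
        parsed = [json.loads(line) for line in text.splitlines() if line.strip()]
    if isinstance(parsed, dict):
        # A single top-level object is treated as one record.
        parsed = [parsed]
    if not isinstance(parsed, list):
        raise ValueError("response is neither an object nor a collection of objects")
    # Keep only JSON objects; scalars inside the collection are not records.
    return [item for item in parsed if isinstance(item, dict)]
```

If a sample response from the API parses into a non-empty list here, the "returns a collection of records as JSON" requirement is probably met; the compatibility guide remains the authoritative checklist.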
diff --git a/docs/connector-development/partner-certified-destinations.md b/docs/connector-development/partner-certified-destinations.md index cec71cf14265..2d11d228451c 100644 --- a/docs/connector-development/partner-certified-destinations.md +++ b/docs/connector-development/partner-certified-destinations.md @@ -4,36 +4,37 @@ **Thank you for contributing and committing to maintain your Airbyte destination connector 🥂** -This document outlines the minimum expectations for partner-certified destination. We will **strongly** recommend that partners use the relevant CDK, but also want to support developers that *need* to develop in a different language. This document covers concepts implicitly built into our CDKs for this use-case. +This document outlines the minimum expectations for partner-certified destination. We will **strongly** recommend that partners use the relevant CDK, but also want to support developers that *need* to develop in a different language. This document covers concepts implicitly built into our CDKs for this use-case. ## Definitions **Partner Certified Destination:** A destination which is fully supported by the maintainers of the platform that is being loaded to. These connectors are not guaranteed by Airbyte directly, but instead the maintainers of the connector contribute fixes and improvements to ensure a quality experience for Airbyte users. Partner destinations are noted as such with a special “Partner” badge on the Integrations page, distinguishing them from other community maintained connectors on the Marketplace. -**Bulk Destinations:** A destination which accepts tables and columns as input, files, or otherwise unconstrained content. The majority of bulk destinations are database-like tabular (warehouses, data lakes, databases), but may also include file or blob destinations. The defining characteristic of bulk destinations is that they accept data in the shape of the source (e.g. 
tables, columns or content doesn’t change much from the representation of the source). These destinations can usually hold large amounts of data, and are the fastest to load. +**Bulk Destinations:** A destination which accepts tables and columns as input, files, or otherwise unconstrained content. The majority of bulk destinations are database-like tabular (warehouses, data lakes, databases), but may also include file or blob destinations. The defining characteristic of bulk destinations is that they accept data in the shape of the source (e.g. tables, columns or content doesn’t change much from the representation of the source). These destinations can usually hold large amounts of data, and are the fastest to load. -**Publish Destinations:** A publish-type destination, often called a “reverse ETL” destination loads data to an external service or API. These destinations may be “picky”, having specific schema requirements for incoming streams. Common publish-type use cases include: publishing data to a REST API, publishing data to a messaging endpoint (e.g email, push notifications, etc.), and publishing data to an LLM vector store. Specific examples include: Destination-Pinecone, Destination-Vectara, and Destination-Weaviate. These destinations can usually hold finite amounts of data, and slower to load. +**Publish Destinations:** A publish-type destination, often called a “reverse ETL” destination loads data to an external service or API. These destinations may be “picky”, having specific schema requirements for incoming streams. Common publish-type use cases include: publishing data to a REST API, publishing data to a messaging endpoint (e.g email, push notifications, etc.), and publishing data to an LLM vector store. Specific examples include: Destination-Pinecone, Destination-Vectara, and Destination-Weaviate. These destinations can usually hold finite amounts of data, and slower to load. 
## “Partner-Certified" Listing Requirements: -### Issue Tracking: +### Issue Tracking: Create a public GitHub repo/project to be shared with Airbyte and its users. -### Airbyte Communications: +### Airbyte Communications: Monitor a Slack channel for communications directly from the Airbyte Support and Development teams. -### SLAs: +### SLAs: Respect a 3 business day first response maximum to customer inquiries or bug reports. -### Metrics: + +### Metrics: Maintain >=95% first-sync success and >=95% overall sync success on your destination connector. _Note: config_errors are not counted against this metric._ -### Platform Updates: +### Platform Updates: Adhere to a regular update cadence for either the relevant Airbyte-managed CDK, or a commit to updating your connector to meet any new platform requirements at least once every 6 months. -### Connector Updates: +### Connector Updates: Important bugs are audited and major problems are solved within a reasonable timeframe. -### Security: +### Security: Validate that the connector is using HTTPS and secure-only access to customer data. @@ -46,11 +47,11 @@ We won’t call out every requirement of the Airbyte Protocol (link) but below a * Destinations must capture state messages from sources, and must emit those state messages to STDOUT only after all preceding records have been durably committed to the destination * The Airbyte platform interprets state messages emitted from the destination as a logical checkpoint. Destinations must emit all of the state messages they receive, and only after records have been durably written and/or committed to the destination’s long-term storage. * If a destination emits the source’s state message before preceding records are finalized, this is an error. - * _Note: In general, state handling should always be handled by the respective CDK. Destination authors should not attempt to handle this themselves._ + * _Note: In general, state handling should always be handled by the respective CDK. 
Destination authors should not attempt to handle this themselves._ * Destinations must append record counts to the Source’s state message before emitting (New for Airbyte 1.0) * For each state record emitted, the destination should attach to the state message the count of records processed and associated with that state message. - * This should always be handled by the Python or Java CDK. Destination authors should not attempt to handle this themselves. + * This should always be handled by the Python or Java CDK. Destination authors should not attempt to handle this themselves. * State messages should be emitted with no gap longer than 15 minutes * Checkpointing requires commit and return state every 15 minutes. When batching records for efficiency, the destination should also include logic to finalize batches approximately every 10 minutes, or whatever interval is appropriate to meet the minimum 15 minute checkpoint frequency. @@ -77,7 +78,7 @@ _Note: Because **Publish Destinations** have little control over table structure * Bulk Destinations must utilize _airbyte_meta.changes[] to record in-flight fixes or changes * This includes logging information on any fields that had to be nullified due to destination capacity restrictions (e.g. data could not fit), and/or problematic input data (e.g. impossible date or out-of-range date). - * It’s also OK for the destination to make record changes (e.g. property too large to fit) as long as the change doesn’t apply to the PK or cursor, and the change is recorded in _airbyte_meta.changes[] as well. + * It’s also OK for the destination to make record changes (e.g. property too large to fit) as long as the change doesn’t apply to the PK or cursor, and the change is recorded in _airbyte_meta.changes[] as well. * Bulk Destinations must accept new columns arriving from the source. 
(“Schema Evolution”) * Tabular destinations should be consistent in how they handle schema evolutions over the period of a connection’s lifecycle, including gracefully handling expected organic schema evolutions, including the addition of new columns after the initial sync. diff --git a/docs/connector-development/schema-reference.md b/docs/connector-development/schema-reference.md index e243d76d6fac..9241923d0a72 100644 --- a/docs/connector-development/schema-reference.md +++ b/docs/connector-development/schema-reference.md @@ -1,5 +1,10 @@ # Schema Reference +:::note +You only need this if you're building a connector with Python or Java CDKs. +If you're using Connector Builder, you can use [declared schemas](./connector-builder-ui/record-processing#declared-schema) instead. +::: + This document provides instructions on how to create a static schema for your Airbyte stream, which is necessary for integrating data from various sources. You can check out all the supported data types and examples at [this link](../understanding-airbyte/supported-data-types.md). @@ -66,5 +71,3 @@ The schema is then translated into the following JSON format. Please note that i } } ``` - -We hope this guide helps you create a successful static schema for your Airbyte stream. Please don't hesitate to reach out if you have any further questions or concerns.
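Reviewer note on the state-handling rules in `partner-certified-destinations.md`: the ordering requirement (emit a state message only after all preceding records are durably committed, with a record count attached) can be sketched as a toy model. Everything below is an assumption made for illustration: `CheckpointingWriter`, its method names, and the `record_count` field are hypothetical and are not the CDK's API or the protocol's actual field names. As the doc says, real connectors should leave this to the CDK.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointingWriter:
    """Toy destination writer (hypothetical): holds states back until a flush."""

    committed: list = field(default_factory=list)       # stands in for durable storage
    emitted_states: list = field(default_factory=list)  # stands in for STDOUT
    _buffer: list = field(default_factory=list)         # records not yet committed
    _pending_states: list = field(default_factory=list) # (state, record count) pairs
    _count_since_state: int = 0

    def on_record(self, record: dict) -> None:
        # Records are only buffered here; nothing is durable yet.
        self._buffer.append(record)
        self._count_since_state += 1

    def on_state(self, state: dict) -> None:
        # Do not emit yet: remember the state and how many records preceded it.
        self._pending_states.append((state, self._count_since_state))
        self._count_since_state = 0

    def flush(self) -> None:
        # Step 1: durably commit every buffered record.
        self.committed.extend(self._buffer)
        self._buffer.clear()
        # Step 2: only now emit the held-back states, in the order received,
        # each with the count of records associated with it.
        for state, count in self._pending_states:
            self.emitted_states.append({"state": state, "record_count": count})
        self._pending_states.clear()
```

In this model, meeting the 15-minute checkpoint rule would mean calling `flush()` on a timer (the doc suggests roughly every 10 minutes) as well as when a batch fills up; the timer itself is left out of the sketch.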