updated docs and spec field description
darynaishchenko committed Aug 21, 2023
1 parent 1056733 commit cf210e6
Showing 3 changed files with 20 additions and 16 deletions.
@@ -18,7 +18,7 @@ connectionSpecification:
row_batch_size:
type: integer
title: Row Batch Size
description: The number of rows fetched when making a Google Sheet API call. Defaults to 200.
description: The number of rows fetched when making a Google Sheet API call. Defaults to 200. You can increase this value according to your needs to avoid rate limits if your data is particularly wide.
default: 200
examples:
- 50
8 changes: 4 additions & 4 deletions docs/integrations/sources/google-sheets.inapp.md
@@ -1,6 +1,6 @@
## Prerequisites

- Access to a Google Sheet
- Spreadsheet Link - The link to the Google spreadsheet you want to sync.
- A Google Workspace user with access to the spreadsheet

:::info
The Google Sheets source connector pulls data from a single Google Sheets spreadsheet. To replicate multiple spreadsheets, set up multiple Google Sheets source connectors in your Airbyte instance.
@@ -13,10 +13,10 @@ The Google Sheets source connector pulls data from a single Google Sheets spread
- **(Recommended)** To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow.
- To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22).
3. For **Spreadsheet Link**, enter the link to the Google spreadsheet. To get the link, go to the Google spreadsheet you want to sync, click **Share** in the top right corner, and click **Copy Link**.
4. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200.
4. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200. You can increase this value according to your needs to avoid rate limits if your data is particularly wide.
5. For **Convert Column Names to SQL-Compliant Format**, turn this option on to convert column names to a standardized, SQL-compliant format. For example, 'My Name' -> 'my_name'. Enable it if your destination is SQL-based.

### Google Sheets format requirements
- Sheet names and column headers must only contain alphanumeric characters or `_`, as specified in the [**Airbyte Protocol**](../../understanding-airbyte/airbyte-protocol.md). For example, if your sheet or column header is named `the data`, rename it to `the_data`. This restriction does not apply to non-header cell values.
- Airbyte only supports replicating [Grid](https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/sheets#SheetType) sheets.
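
The naming rule above can be checked before a sync. The following is a minimal sketch, assuming "alphanumeric" means ASCII letters and digits; the helper names `is_valid_name` and `suggest_name` are hypothetical and are not part of the connector.

```python
import re

# Assumption: "alphanumeric characters or _" means ASCII letters, digits, and "_".
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if a sheet name or column header uses only allowed characters."""
    return _VALID_NAME.fullmatch(name) is not None


def suggest_name(name: str) -> str:
    """Replace each run of disallowed characters with a single underscore."""
    return re.sub(r"[^A-Za-z0-9_]+", "_", name.strip())


if __name__ == "__main__":
    for header in ["the data", "the_data"]:
        print(header, is_valid_name(header), suggest_name(header))
```

A check like this only flags headers to rename in the spreadsheet; it does not change how the connector reads them.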

For detailed information on supported sync modes, supported streams, and performance considerations, refer to the full documentation for [Google Sheets](https://docs.airbyte.com/integrations/sources/google-sheets/).
26 changes: 15 additions & 11 deletions docs/integrations/sources/google-sheets.md
@@ -8,7 +8,15 @@ The Google Sheets source connector pulls data from a single Google Sheets spread

### Prerequisites
- Spreadsheet Link - The link to the Google spreadsheet you want to sync.
- Authentication - Credentials for connecting to the Google Sheets API.
<env:cloud>
- **For Airbyte Cloud:** A Google Workspace user with access to the spreadsheet
</env:cloud>
<env:oss>
- **For Airbyte Open Source:**
  - A GCP project
  - The Google Sheets API enabled in your GCP project
  - A Service Account Key with access to the spreadsheet you want to replicate
</env:oss>

## Setup guide

@@ -26,8 +34,8 @@ To set up Google Sheets as a source in Airbyte Cloud:
- **(Recommended)** To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow.
- To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22).
6. For **Spreadsheet Link**, enter the link to the Google spreadsheet. To get the link, go to the Google spreadsheet you want to sync, click **Share** in the top right corner, and click **Copy Link**.
7. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200.
7. For **Columns Name Conversion**, if set as true column names will be transformed. Examples: My_Name -> my_name, My Name -> my_name, MyName -> my_name, My123name -> my_123_name, привіт світ -> privit_svit.
7. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200. You can increase this value according to your needs to avoid rate limits if your data is particularly wide.
8. For **Convert Column Names to SQL-Compliant Format**, turn this option on to convert column names to a standardized, SQL-compliant format. For example, 'My Name' -> 'my_name'. Enable it if your destination is SQL-based.

<!-- /env:cloud -->

@@ -50,15 +58,12 @@ To set up Google Sheets as a source in Airbyte Open Source:
- To authenticate your Google account via OAuth, enter your Google application's [client ID, client secret, and refresh token](https://developers.google.com/identity/protocols/oauth2).
- To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22).
6. For **Spreadsheet Link**, enter the link to the Google spreadsheet. To get the link, go to the Google spreadsheet you want to sync, click **Share** in the top right corner, and click **Copy Link**.
7. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200.
8. For **Columns Name Conversion**, if set as true column names will be transformed. Examples: My_Name -> my_name, My Name -> my_name, MyName -> my_name, My123name -> my_123_name, привіт світ -> privit_svit.
7. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200. You can increase this value according to your needs to avoid rate limits if your data is particularly wide.
8. For **Convert Column Names to SQL-Compliant Format**, turn this option on to convert column names to a standardized, SQL-compliant format. For example, 'My Name' -> 'my_name'. Enable it if your destination is SQL-based.
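
To illustrate the kind of conversion this option performs, here is a minimal sketch that reproduces the ASCII examples from the docs (`My Name`, `My_Name`, `MyName`, `My123name`). It is not the connector's implementation: the function name `to_sql_compliant` is hypothetical, and transliteration of non-Latin headers is not covered.

```python
import re


def to_sql_compliant(name: str) -> str:
    """Illustrative snake_case conversion; an assumption, not the connector's code."""
    s = name.strip()
    # Split camelCase boundaries: "MyName" -> "My_Name".
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
    # Split letter/digit boundaries: "My123name" -> "My_123_name".
    s = re.sub(r"(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", "_", s)
    # Collapse every run of other characters (spaces, "_", punctuation) into "_".
    s = re.sub(r"[^A-Za-z0-9]+", "_", s)
    return s.strip("_").lower()


if __name__ == "__main__":
    for header in ["My Name", "My_Name", "MyName", "My123name"]:
        print(header, "->", to_sql_compliant(header))
```

Several distinct headers can map to the same converted name (for example `My Name` and `MyName`), so it may be worth checking for collisions before enabling the option.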

### Output schema

Each sheet in the selected spreadsheet is synced as a separate stream. Each selected column in the sheet is synced as a string field.
:::note
Sheet names and column headers must contain only alphanumeric characters or `_`, as specified in the [**Airbyte Protocol**](../../understanding-airbyte/airbyte-protocol.md). For example, if your sheet or column header is named `the data`, rename it to `the_data`. This restriction does not apply to non-header cell values.
:::

Airbyte only supports replicating [Grid](https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/sheets#SheetType) sheets.

Expand All @@ -82,14 +87,13 @@ The Google Sheets source connector supports the following sync modes:
The [Google API rate limit](https://developers.google.com/sheets/api/limits) is 100 requests per 100 seconds per user and 500 requests per 100 seconds per project. Airbyte batches requests to the API to pull data efficiently and respects these rate limits. We recommend not using the same service user for more than 3 instances of the Google Sheets source connector to ensure high transfer speeds.
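
As a rough back-of-the-envelope check on how **Row Batch Size** interacts with the per-user quota quoted above, the sketch below estimates how many API calls a sheet needs. It assumes one request per batch of rows and ignores any metadata or header requests; the function names are hypothetical and this is not how the connector schedules its calls.

```python
import math

# Figures taken from the paragraph above: 100 requests per 100 seconds per user.
USER_QUOTA_REQUESTS = 100
QUOTA_WINDOW_SECONDS = 100


def requests_needed(total_rows: int, row_batch_size: int = 200) -> int:
    """Number of row-fetch requests, assuming one request per batch."""
    if row_batch_size <= 0:
        raise ValueError("row_batch_size must be positive")
    return math.ceil(total_rows / row_batch_size)


def quota_windows_needed(total_rows: int, row_batch_size: int = 200) -> int:
    """Number of 100-second quota windows those requests span for one user."""
    return math.ceil(requests_needed(total_rows, row_batch_size) / USER_QUOTA_REQUESTS)


if __name__ == "__main__":
    for batch in (200, 1000):
        print(batch, requests_needed(50_000, batch), quota_windows_needed(50_000, batch))
```

Under these assumptions, a 50,000-row sheet needs 250 requests at the default batch size of 200 and 50 requests at a batch size of 1000, which is why a larger batch size leaves more headroom under the quota.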

## Troubleshooting
- Data changes in spreadsheets. If your sheet becomes empty, deleted, etc., its synchronization will be stopped. You will see the corresponding extended logs about this problem.
- Permissions. The connector cannot synchronize tables to which you do not have proper permissions.
- Data changes in spreadsheets. If your sheet is completely empty (no header rows) or deleted, Airbyte will stop attempting to sync it until it comes back. If this happens, the sync logs will contain a message saying the sheet has been skipped when syncing the full spreadsheet.

## Changelog

| Version | Date | Pull Request | Subject |
|---------|------------|----------------------------------------------------------|-----------------------------------------------------------------------------------|
| 0.3.6 | 2023-08-16 | | Update to latest CDK |
| 0.3.6 | 2023-08-16 | [29491](https://github.com/airbytehq/airbyte/pull/29491) | Update to latest CDK |
| 0.3.5 | 2023-08-16 | [29427](https://github.com/airbytehq/airbyte/pull/29427) | Add stop reading in case of 429 error |
| 0.3.4 | 2023-05-15 | [29453](https://github.com/airbytehq/airbyte/pull/29453) | Update spec descriptions |
| 0.3.3 | 2023-08-10 | [29327](https://github.com/airbytehq/airbyte/pull/29327) | Add user-friendly error message for 404 and 403 error while discover |