From 7ffb6bf4b498b6b187eec33ccda9fda327b92e04 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Wed, 28 Jun 2023 15:24:03 +0800 Subject: [PATCH 01/30] chore: add v7.2 to dispatch.yml (#14013) * Add temp.md * Delete temp.md * Update dispatch.yml --- .github/workflows/dispatch.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/dispatch.yml b/.github/workflows/dispatch.yml index 8b740a8905100..10b0071ddcf30 100644 --- a/.github/workflows/dispatch.yml +++ b/.github/workflows/dispatch.yml @@ -6,6 +6,7 @@ on: - ".github/**" branches: - master + - release-7.2 - release-7.1 - release-7.0 - release-6.6 From 79863d603d7da2ffdea9d8baddc90311d9decfe4 Mon Sep 17 00:00:00 2001 From: fzzf678 <108643977+fzzf678@users.noreply.github.com> Date: Wed, 28 Jun 2023 15:46:08 +0800 Subject: [PATCH 02/30] sql: add switch for check constraint (#13998) --- constraints.md | 9 +++++++++ system-variables.md | 8 ++++++++ 2 files changed, 17 insertions(+) diff --git a/constraints.md b/constraints.md index 88035214731b3..dbfda1043c83d 100644 --- a/constraints.md +++ b/constraints.md @@ -52,6 +52,10 @@ Query OK, 1 row affected (0.03 sec) ## CHECK +> **Note:** +> +> The `CHECK` constraint feature is disabled by default. To enable it, you need to set the [`tidb_enable_check_constraint`](/system-variables.md#tidb_enable_check_constraint-new-in-v720) variable to `ON`. + A `CHECK` constraint restricts the values of a column in a table to meet your specified conditions. When the `CHECK` constraint is added to a table, TiDB checks whether the constraint is satisfied during the insertion or updates of data into the table. If the constraint is not met, an error is returned. 
The syntax for the `CHECK` constraint in TiDB is the same as that in MySQL: @@ -129,6 +133,11 @@ In addition to specifying `[NOT] ENFORCED` when adding the constraint, you can a ALTER TABLE t ALTER CONSTRAINT c1 NOT ENFORCED; ``` +### MySQL compatibility + +- It is not supported to add a `CHECK` constraint while adding a column (for example, `ALTER TABLE t ADD COLUMN a CHECK(a > 0)`). In this case, only the column is added successfully, and TiDB ignores the `CHECK` constraint without reporting any error. +- It is not supported to use `ALTER TABLE t CHANGE a b int CHECK(b > 0)` to add a `CHECK` constraint. When this statement is executed, TiDB reports an error. + ## UNIQUE KEY Unique constraints mean that all non-null values in a unique index and a primary key column are unique. diff --git a/system-variables.md b/system-variables.md index dd74d34608898..9a19da120822d 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1616,6 +1616,14 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; - Default value: `OFF` - This variable is used to control whether to enable the cascades planner. +### tidb_enable_check_constraint New in v7.2.0 + +- Scope: GLOBAL +- Persists to cluster: Yes +- Type: Boolean +- Default value: `OFF` +- This variable is used to control whether to enable the [`CHECK` constraint](/constraints.md#check) feature. 
+ ### tidb_enable_chunk_rpc New in v4.0 - Scope: SESSION From aafd0b84d0b9531b6f174208a0a81fdfe9b34104 Mon Sep 17 00:00:00 2001 From: Ran Date: Wed, 28 Jun 2023 16:59:39 +0800 Subject: [PATCH 03/30] add ticdc command line server flag description (#13915) --- ticdc/ticdc-sink-to-cloud-storage.md | 1 + ticdc/ticdc-sink-to-kafka.md | 1 + ticdc/ticdc-sink-to-mysql.md | 1 + 3 files changed, 3 insertions(+) diff --git a/ticdc/ticdc-sink-to-cloud-storage.md b/ticdc/ticdc-sink-to-cloud-storage.md index 8315b6bb57747..eda0e55e5b51f 100644 --- a/ticdc/ticdc-sink-to-cloud-storage.md +++ b/ticdc/ticdc-sink-to-cloud-storage.md @@ -27,6 +27,7 @@ The output is as follows: Info: {"upstream_id":7171388873935111376,"namespace":"default","id":"simple-replication-task","sink_uri":"s3://logbucket/storage_test?protocol=canal-json","create_time":"2022-11-29T18:52:05.566016967+08:00","start_ts":437706850431664129,"engine":"unified","config":{"case_sensitive":true,"enable_old_value":true,"force_replicate":false,"ignore_ineligible_table":false,"check_gc_safe_point":true,"enable_sync_point":false,"sync_point_interval":600000000000,"sync_point_retention":86400000000000,"filter":{"rules":["*.*"],"event_filters":null},"mounter":{"worker_num":16},"sink":{"protocol":"canal-json","schema_registry":"","csv":{"delimiter":",","quote":"\"","null":"\\N","include_commit_ts":false},"column_selectors":null,"transaction_atomicity":"none","encoder_concurrency":16,"terminator":"\r\n","date_separator":"none","enable_partition_separator":false},"consistent":{"level":"none","max_log_size":64,"flush_interval":2000,"storage":""}},"state":"normal","creator_version":"v6.5.0-master-dirty"} ``` +- `--server`: The address of any TiCDC server in the TiCDC cluster. - `--changefeed-id`: The ID of the changefeed. The format must match the `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$` regular expression. If this ID is not specified, TiCDC automatically generates a UUID (the version 4 format) as the ID. 
- `--sink-uri`: The downstream address of the changefeed. For details, see [Configure sink URI](#configure-sink-uri). - `--start-ts`: The starting TSO of the changefeed. TiCDC starts pulling data from this TSO. The default value is the current time. diff --git a/ticdc/ticdc-sink-to-kafka.md b/ticdc/ticdc-sink-to-kafka.md index ab25ff59b6865..21b866c7fb0c3 100644 --- a/ticdc/ticdc-sink-to-kafka.md +++ b/ticdc/ticdc-sink-to-kafka.md @@ -24,6 +24,7 @@ ID: simple-replication-task Info: {"sink-uri":"kafka://127.0.0.1:9092/topic-name?protocol=canal-json&kafka-version=2.4.0&partition-num=6&max-message-bytes=67108864&replication-factor=1","opts":{},"create-time":"2020-03-12T22:04:08.103600025+08:00","start-ts":415241823337054209,"target-ts":0,"admin-job-type":0,"sort-engine":"unified","sort-dir":".","config":{"case-sensitive":true,"filter":{"rules":["*.*"],"ignore-txn-start-ts":null,"ddl-allow-list":null},"mounter":{"worker-num":16},"sink":{"dispatchers":null},"scheduler":{"type":"table-number","polling-time":-1}},"state":"normal","history":null,"error":null} ``` +- `--server`: The address of any TiCDC server in the TiCDC cluster. - `--changefeed-id`: The ID of the replication task. The format must match the `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$` regular expression. If this ID is not specified, TiCDC automatically generates a UUID (the version 4 format) as the ID. - `--sink-uri`: The downstream address of the replication task. For details, see [Configure sink URI with `kafka`](#configure-sink-uri-for-kafka). - `--start-ts`: Specifies the starting TSO of the changefeed. From this TSO, the TiCDC cluster starts pulling data. The default value is the current time. 
diff --git a/ticdc/ticdc-sink-to-mysql.md b/ticdc/ticdc-sink-to-mysql.md index e02031c8779d4..9a53ec331ab29 100644 --- a/ticdc/ticdc-sink-to-mysql.md +++ b/ticdc/ticdc-sink-to-mysql.md @@ -24,6 +24,7 @@ ID: simple-replication-task Info: {"sink-uri":"mysql://root:123456@127.0.0.1:3306/","opts":{},"create-time":"2020-03-12T22:04:08.103600025+08:00","start-ts":415241823337054209,"target-ts":0,"admin-job-type":0,"sort-engine":"unified","sort-dir":".","config":{"case-sensitive":true,"filter":{"rules":["*.*"],"ignore-txn-start-ts":null,"ddl-allow-list":null},"mounter":{"worker-num":16},"sink":{"dispatchers":null},"scheduler":{"type":"table-number","polling-time":-1}},"state":"normal","history":null,"error":null} ``` +- `--server`: The address of any TiCDC server in the TiCDC cluster. - `--changefeed-id`: The ID of the replication task. The format must match the `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$` regular expression. If this ID is not specified, TiCDC automatically generates a UUID (the version 4 format) as the ID. - `--sink-uri`: The downstream address of the replication task. For details, see [Configure sink URI with `mysql`/`tidb`](#configure-sink-uri-for-mysql-or-tidb). - `--start-ts`: Specifies the starting TSO of the changefeed. From this TSO, the TiCDC cluster starts pulling data. The default value is the current time. 
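The `--changefeed-id` format rule described above can be checked locally before a changefeed is created. The following is a minimal Python sketch that applies the documented regular expression `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$`. The helper name `is_valid_changefeed_id` is hypothetical and is not part of TiCDC, which performs its own validation.

```python
import re

# Pattern copied verbatim from the `--changefeed-id` description.
CHANGEFEED_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$")


def is_valid_changefeed_id(changefeed_id: str) -> bool:
    """Return True if the ID matches the documented format.

    Hypothetical helper for illustration only. `fullmatch` is used so that
    a trailing newline is rejected as well.
    """
    return CHANGEFEED_ID_PATTERN.fullmatch(changefeed_id) is not None


if __name__ == "__main__":
    for candidate in ["simple-replication-task", "simple_replication_task", "task-"]:
        print(candidate, is_valid_changefeed_id(candidate))
```

Under this pattern, an ID consists of alphanumeric groups joined by single hyphens, so underscores, leading or trailing hyphens, and consecutive hyphens are all rejected.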
From 30b21751b3634ced707f989e30a0bf54d0455757 Mon Sep 17 00:00:00 2001 From: Yifan Xu <30385241+xuyifangreeneyes@users.noreply.github.com> Date: Wed, 28 Jun 2023 17:25:09 +0800 Subject: [PATCH 04/30] turn on lite-init-stats and force-init-stats by default (#13961) --- statistics.md | 8 ++------ tidb-configuration-file.md | 18 +++--------------- 2 files changed, 5 insertions(+), 21 deletions(-) diff --git a/statistics.md b/statistics.md index f3947801ccc5b..1c5ed0e9e225d 100644 --- a/statistics.md +++ b/statistics.md @@ -744,18 +744,14 @@ After enabling the synchronously loading statistics feature, you can further con - To specify the maximum number of columns that the synchronously loading statistics feature can process concurrently, modify the value of the [`stats-load-concurrency`](/tidb-configuration-file.md#stats-load-concurrency-new-in-v540) option in the TiDB configuration file. The default value is `5`. - To specify the maximum number of column requests that the synchronously loading statistics feature can cache, modify the value of the [`stats-load-queue-size`](/tidb-configuration-file.md#stats-load-queue-size-new-in-v540) option in the TiDB configuration file. The default value is `1000`. -During TiDB startup, SQL statements executed before the initial statistics are fully loaded might have suboptimal execution plans, thus causing performance issues. To avoid such issues, TiDB v7.1.0 introduces the configuration parameter [`force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v710). With this option, you can control whether TiDB provides services only after statistics initialization has been finished during startup. This parameter is disabled by default. - -> **Warning:** -> -> Lightweight statistics initialization is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. 
If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. +During TiDB startup, SQL statements executed before the initial statistics are fully loaded might have suboptimal execution plans, thus causing performance issues. To avoid such issues, TiDB v7.1.0 introduces the configuration parameter [`force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v710). With this option, you can control whether TiDB provides services only after statistics initialization has been finished during startup. Starting from v7.2.0, this parameter is enabled by default. Starting from v7.1.0, TiDB introduces [`lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710) for lightweight statistics initialization. - When the value of `lite-init-stats` is `true`, statistics initialization does not load any histogram, TopN, or Count-Min Sketch of indexes or columns into memory. - When the value of `lite-init-stats` is `false`, statistics initialization loads histograms, TopN, and Count-Min Sketch of indexes and primary keys into memory but does not load any histogram, TopN, or Count-Min Sketch of non-primary key columns into memory. When the optimizer needs the histogram, TopN, and Count-Min Sketch of a specific index or column, the necessary statistics are loaded into memory synchronously or asynchronously. -The default value of `lite-init-stats` is `false`, which means to disable lightweight statistics initialization. Setting `lite-init-stats` to `true` speeds up statistics initialization and reduces TiDB memory usage by avoiding unnecessary statistics loading. +The default value of `lite-init-stats` is `true`, which means to enable lightweight statistics initialization. Setting `lite-init-stats` to `true` speeds up statistics initialization and reduces TiDB memory usage by avoiding unnecessary statistics loading. 
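The two `lite-init-stats` modes described above can be summarized as a small decision table. The following Python sketch is purely illustrative: the function name and the object-kind labels are hypothetical and do not correspond to TiDB internals. It only restates which histogram, TopN, and Count-Min Sketch data is loaded into memory during statistics initialization.

```python
# Hypothetical labels for the kinds of objects that carry statistics.
OBJECT_KINDS = ("index", "primary_key", "non_primary_key_column")


def loaded_at_init(lite_init_stats: bool, object_kind: str) -> bool:
    """Return True if histogram/TopN/Count-Min Sketch data for the given
    object kind is loaded into memory during statistics initialization.

    Illustrative model of the documented behavior, not TiDB code.
    """
    if object_kind not in OBJECT_KINDS:
        raise ValueError(f"unknown object kind: {object_kind!r}")
    if lite_init_stats:
        # Lightweight initialization: nothing is loaded up front.
        return False
    # Otherwise only indexes and primary keys are loaded up front.
    return object_kind in ("index", "primary_key")


if __name__ == "__main__":
    for lite in (True, False):
        for kind in OBJECT_KINDS:
            print(f"lite-init-stats={lite}, {kind}: {loaded_at_init(lite, kind)}")
```

In both modes, statistics that are not loaded at initialization are loaded later, when the optimizer needs them.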
diff --git a/tidb-configuration-file.md b/tidb-configuration-file.md index eff71fd789d12..d9f86918002a3 100644 --- a/tidb-configuration-file.md +++ b/tidb-configuration-file.md @@ -545,40 +545,28 @@ Configuration items related to performance. ### `stats-load-concurrency` New in v5.4.0 -> **Warning:** -> -> Currently, synchronously loading statistics is an experimental feature. It is not recommended that you use it in production environments. - + The maximum number of columns that the TiDB synchronously loading statistics feature can process concurrently. + Default value: `5` + Currently, the valid value range is `[1, 128]`. ### `stats-load-queue-size` New in v5.4.0 -> **Warning:** -> -> Currently, synchronously loading statistics is an experimental feature. It is not recommended that you use it in production environments. - + The maximum number of column requests that the TiDB synchronously loading statistics feature can cache. + Default value: `1000` + Currently, the valid value range is `[1, 100000]`. ### `lite-init-stats` New in v7.1.0 -> **Warning:** -> -> This variable is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. - + Controls whether to use lightweight statistics initialization during TiDB startup. -+ Default value: false ++ Default value: `false` for versions earlier than v7.2.0, `true` for v7.2.0 and later versions. + When the value of `lite-init-stats` is `true`, statistics initialization does not load any histogram, TopN, or Count-Min Sketch of indexes or columns into memory. When the value of `lite-init-stats` is `false`, statistics initialization loads histograms, TopN, and Count-Min Sketch of indexes and primary keys into memory but does not load any histogram, TopN, or Count-Min Sketch of non-primary key columns into memory. 
When the optimizer needs the histogram, TopN, and Count-Min Sketch of a specific index or column, the necessary statistics are loaded into memory synchronously or asynchronously (controlled by [`tidb_stats_load_sync_wait`](/system-variables.md#tidb_stats_load_sync_wait-new-in-v540)). + Setting `lite-init-stats` to `true` speeds up statistics initialization and reduces TiDB memory usage by avoiding unnecessary statistics loading. For details, see [Load statistics](/statistics.md#load-statistics). ### `force-init-stats` New in v7.1.0 + Controls whether to wait for statistics initialization to finish before providing services during TiDB startup. -+ Default value: false -+ When the value of `force-init-stats` is `true`, TiDB needs to wait until statistics initialization is finished before providing services upon startup. If there are a large number of tables and partitions, setting `force-init-stats` to `true` might prolong the time it takes for TiDB to start providing services. ++ Default value: `false` for versions earlier than v7.2.0, `true` for v7.2.0 and later versions. ++ When the value of `force-init-stats` is `true`, TiDB needs to wait until statistics initialization is finished before providing services upon startup. Note that if there are a large number of tables and partitions and the value of [`lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710) is `false`, setting `force-init-stats` to `true` might prolong the time it takes for TiDB to start providing services. + When the value of `force-init-stats` is `false`, TiDB can still provide services before statistics initialization is finished, but the optimizer uses pseudo statistics to make decisions, which might result in suboptimal execution plans. 
## opentracing From 1be833524dfa9cfb7eea330b631ff5cb0bad163c Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Wed, 28 Jun 2023 18:06:16 +0800 Subject: [PATCH 05/30] v7.2: Update the support info of optimize-filters-for-memory (#13930) --- releases/release-7.1.0.md | 2 -- tikv-configuration-file.md | 4 ++-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/releases/release-7.1.0.md b/releases/release-7.1.0.md index aee94e6c6a9fc..87e3d002b4164 100644 --- a/releases/release-7.1.0.md +++ b/releases/release-7.1.0.md @@ -348,8 +348,6 @@ Compared with the previous LTS 6.5.0, 7.1.0 not only includes new features, impr | TiDB | [`performance.force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v710) | Newly added | Controls whether to wait for statistics initialization to finish before providing services during TiDB startup. | | TiDB | [`performance.lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710) | Newly added | Controls whether to use lightweight statistics initialization during TiDB startup. | | TiDB | [`log.timeout`](/tidb-configuration-file.md#timeout-new-in-v710) | Newly added | Sets the timeout for log-writing operations in TiDB. In case of a disk failure that prevents logs from being written, this configuration item can trigger the TiDB process to panic instead of hang. The default value is `0`, which means no timeout is set. | -| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].optimize-filters-for-memory](/tikv-configuration-file.md#optimize-filters-for-memory-new-in-v710) | Newly added | Controls whether to generate Bloom/Ribbon filters that minimize memory internal fragmentation. | -| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].ribbon-filter-above-level](/tikv-configuration-file.md#ribbon-filter-above-level-new-in-v710) | Newly added | Controls whether to use Ribbon filters for levels greater than or equal to this value and use non-block-based bloom filters for levels less than this value. 
| | TiKV | [`split.byte-threshold`](/tikv-configuration-file.md#byte-threshold-new-in-v50) | Modified | Changes the default value from `30MiB` to `100MiB` when [`region-split-size`](/tikv-configuration-file.md#region-split-size) is greater than or equal to 4 GB. | | TiKV | [`split.qps-threshold`](/tikv-configuration-file.md#qps-threshold) | Modified | Changes the default value from `3000` to `7000` when [`region-split-size`](/tikv-configuration-file.md#region-split-size) is greater than or equal to 4 GB. | | TiKV | [`split.region-cpu-overload-threshold-ratio`](/tikv-configuration-file.md#region-cpu-overload-threshold-ratio-new-in-v620) | Modified | Changes the default value from `0.25` to `0.75` when [`region-split-size`](/tikv-configuration-file.md#region-split-size) is greater than or equal to 4 GB. | diff --git a/tikv-configuration-file.md b/tikv-configuration-file.md index f18bc4174dc30..15ad4ddd32fbe 100644 --- a/tikv-configuration-file.md +++ b/tikv-configuration-file.md @@ -1348,7 +1348,7 @@ Configuration items related to `rocksdb.defaultcf`, `rocksdb.writecf`, and `rock + Default value for `defaultcf`: `true` + Default value for `writecf` and `lockcf`: `false` -### `optimize-filters-for-memory` New in v7.1.0 +### `optimize-filters-for-memory` New in v7.2.0 + Determines whether to generate Bloom/Ribbon filters that minimize memory internal fragmentation. + Note that this configuration item takes effect only when [`format-version`](#format-version-new-in-v620) >= 5. @@ -1371,7 +1371,7 @@ Configuration items related to `rocksdb.defaultcf`, `rocksdb.writecf`, and `rock + Determines whether each block creates a bloom filter + Default value: `false` -### `ribbon-filter-above-level` New in v7.1.0 +### `ribbon-filter-above-level` New in v7.2.0 + Determines whether to use Ribbon filters for levels greater than or equal to this value and use non-block-based bloom filters for levels less than this value. 
When this configuration item is set, [`block-based-bloom-filter`](#block-based-bloom-filter) will be ignored. + Note that this configuration item takes effect only when [`format-version`](#format-version-new-in-v620) >= 5. From ea65854aa2816b901f09b86dffbc87bb66daa093 Mon Sep 17 00:00:00 2001 From: Yuanjia Zhang Date: Wed, 28 Jun 2023 18:13:46 +0800 Subject: [PATCH 06/30] tidb: update the default value of variable `tidb_enable_non_prepared_plan_cache` (#13984) --- system-variables.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system-variables.md b/system-variables.md index 9a19da120822d..9156570f093a1 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1792,7 +1792,7 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; - Scope: SESSION | GLOBAL - Persists to cluster: Yes - Type: Boolean -- Default value: `OFF` +- Default value: `ON` - This variable controls whether to enable the [Non-prepared plan cache](/sql-non-prepared-plan-cache.md) feature. ### tidb_enable_non_prepared_plan_cache_for_dml New in v7.1.0 From 91a97ce0c35a660e28aadd44b7ba8fbe6134e429 Mon Sep 17 00:00:00 2001 From: Ran Date: Wed, 28 Jun 2023 19:22:47 +0800 Subject: [PATCH 07/30] ticdc: add ddl detail to bidirectional replication doc (#13913) --- ticdc/ticdc-bidirectional-replication.md | 34 +++++++++++++++++----- ticdc/ticdc-filter.md | 37 ++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 7 deletions(-) diff --git a/ticdc/ticdc-bidirectional-replication.md b/ticdc/ticdc-bidirectional-replication.md index d010d67add19e..b098f5dbd815a 100644 --- a/ticdc/ticdc-bidirectional-replication.md +++ b/ticdc/ticdc-bidirectional-replication.md @@ -38,16 +38,36 @@ After the configuration takes effect, the clusters can perform bi-directional re ## Execute DDL -Bi-directional replication does not support replicating DDL statements. - -If you need to execute DDL statements, take the following steps: - -1. 
Pause the write operations in the tables that need to execute DDL in all clusters. If the DDL statement is adding a non-unique index, skip this step. +After the bidirectional replication is enabled, TiCDC does not replicate any DDL statements. You need to execute DDL statements in the upstream and downstream clusters respectively. + +Note that some DDL statements might cause table structure changes or data change time sequence problems, which might lead to data inconsistency after the replication. Therefore, after enabling bidirectional replication, only the DDL statements in the following table can be executed without stopping the write operations of the application. + +| Event | Does it cause changefeed errors | Note | +|---|---|---| +| create database | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. | +| drop database | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. | +| create table | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. | +| drop table | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. | +| alter table comment | No | | +| rename index | No | | +| alter table index visibility | No | | +| add partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. | +| drop partition | No | | +| create view | No | | +| drop view | No | | +| alter column default value | No | | +| reorganize partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. 
| +| alter table ttl | No | | +| alter table remove ttl | No | | +| add **not unique** index | No | | +| drop **not unique** index | No | | + +If you need to execute DDL statements that are not in the preceding table, take the following steps: + +1. Pause the write operations in the tables that need to execute DDL in all clusters. 2. After the write operations of the corresponding tables in all clusters have been replicated to other clusters, manually execute all DDL statements in each TiDB cluster. 3. After the DDL statements are executed, resume the write operations. -Note that a DDL statement that adds non-unique index does not break bi-directional replication, so you do not need to pause the write operations in the corresponding table. - ## Stop bi-directional replication After the application has stopped writing data, you can insert a special record into each cluster. By checking the two special records, you can make sure that data in two clusters are consistent. diff --git a/ticdc/ticdc-filter.md b/ticdc/ticdc-filter.md index 6ad1c17ec9250..2bed375030d28 100644 --- a/ticdc/ticdc-filter.md +++ b/ticdc/ticdc-filter.md @@ -87,3 +87,40 @@ Description of configuration parameters: > > - When TiDB updates a value in the column of the clustered index, TiDB splits an `UPDATE` event into a `DELETE` event and an `INSERT` event. TiCDC does not identify such events as an `UPDATE` event and thus cannot correctly filter out such events. > - When you configure a SQL expression, make sure all tables that matches `matcher` contain all the columns specified in the SQL expression. Otherwise, the replication task cannot be created. In addition, if the table schema changes during the replication, which results in a table no longer containing a required column, the replication task fails and cannot be resumed automatically. In such a situation, you must manually modify the configuration and resume the task. 
+ +## DDL allow list + +Currently, TiCDC uses an allow list to replicate DDL statements. Only the DDL statements in the allow list are replicated to the downstream. The DDL statements not in the allow list are not replicated to the downstream. + +The allow list of DDL statements supported by TiCDC is as follows: + +- create database +- drop database +- create table +- drop table +- add column +- drop column +- create index / add index +- drop index +- truncate table +- modify column +- rename table +- alter column default value +- alter table comment +- rename index +- add partition +- drop partition +- truncate partition +- create view +- drop view +- alter table character set +- alter database character set +- recover table +- add primary key +- drop primary key +- rebase auto id +- alter table index visibility +- exchange partition +- reorganize partition +- alter table ttl +- alter table remove ttl From dc95dcba143c4f8e32c49e93dec90f5cb7272719 Mon Sep 17 00:00:00 2001 From: Yiding Cui Date: Wed, 28 Jun 2023 19:27:16 +0800 Subject: [PATCH 08/30] system variable: add the description for mpp cte shared scan (#14007) --- system-variables.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/system-variables.md b/system-variables.md index 9156570f093a1..5959fafc7ebfb 100644 --- a/system-variables.md +++ b/system-variables.md @@ -3510,6 +3510,18 @@ mysql> desc select count(distinct a) from test.t; - This variable is used to control whether to enable the [TiFlash late materialization](/tiflash/tiflash-late-materialization.md) feature. Note that TiFlash late materialization does not take effect in the [fast scan mode](/tiflash/use-fastscan.md). - When this variable is set to `OFF` to disable the TiFlash late materialization feature, to process a `SELECT` statement with filter conditions (`WHERE` clause), TiFlash scans all the data of the required columns before filtering. 
When this variable is set to `ON` to enable the TiFlash late materialization feature, TiFlash can first scan the column data related to the filter conditions that are pushed down to the TableScan operator, filter the rows that meet the conditions, and then scan the data of other columns of these rows for further calculations, thereby reducing IO scans and computations of data processing. +### tidb_opt_enable_mpp_shared_cte_execution New in v7.2.0 + +> **Warning:** +> +> The feature controlled by this variable is not fully functional in the current TiDB version. Do not change the default value. + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Type: Boolean +- Default value: `OFF` +- This variable controls whether the non-recursive [common table expressions (CTE)](/sql-statements/sql-statement-with.md) can be executed on TiFlash MPP instead of on TiDB. + ### tidb_opt_fix_control New in v7.1.0 From 2321db7de1045d3a9663eb2175a0d1c368bededb Mon Sep 17 00:00:00 2001 From: Lucas Date: Wed, 28 Jun 2023 20:31:45 +0800 Subject: [PATCH 09/30] rocksdb.*cf: supply descriptions on `ttl` and `periodic-compaction-seconds`. (#13980) --- tikv-configuration-file.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tikv-configuration-file.md b/tikv-configuration-file.md index 15ad4ddd32fbe..e08d0a15b29ea 100644 --- a/tikv-configuration-file.md +++ b/tikv-configuration-file.md @@ -1539,6 +1539,18 @@ Configuration items related to `rocksdb.defaultcf`, `rocksdb.writecf`, and `rock - `5`: Can be read by TiKV v6.1 and later versions. Full and partitioned filters use a faster and more accurate Bloom filter implementation with a different schema. + Default value: `2` +### `ttl` New in v7.2.0 + ++ SST files with updates older than the TTL will be automatically selected for compaction. These SST files will go through the compaction in a cascading way so that they can be compacted to the bottommost level or file. 
++ Default value: `"30d"` ++ Unit: s(second)|h(hour)|d(day) + +### `periodic-compaction-seconds` New in v7.2.0 + ++ The time interval for periodic compaction. SST files with updates older than this value will be selected for compaction and rewritten to the same level where these SST files originally reside. ++ Default value: `"30d"` ++ Unit: s(second)|h(hour)|d(day) + ## rocksdb.defaultcf.titan Configuration items related to `rocksdb.defaultcf.titan`. From d3efec0b737057b9a570ff2b2f7d015db947acf5 Mon Sep 17 00:00:00 2001 From: Elsa <111482174+elsa0520@users.noreply.github.com> Date: Wed, 28 Jun 2023 20:33:16 +0800 Subject: [PATCH 10/30] add runtime filter session variable (#14024) --- system-variables.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/system-variables.md b/system-variables.md index 5959fafc7ebfb..e764639e3200b 100644 --- a/system-variables.md +++ b/system-variables.md @@ -4303,6 +4303,30 @@ SHOW WARNINGS; - If you upgrade from a TiDB version earlier than v4.0.0 to v4.0.0 or later versions, the format version is not changed, and TiDB continues to use the old format of version `1` to write data to the table, which means that **only newly created clusters use the new data format by default**. - Note that modifying this variable does not affect the old data that has been saved, but applies the corresponding version format only to the newly written data after modifying this variable. +### tidb_runtime_filter_mode New in v7.2.0 + +> **Warning:** +> +> The feature controlled by this variable is not fully functional in the current TiDB version. Do not change the default value. + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Type: Enumeration +- Default value: `OFF` +- Value options: `OFF`, `LOCAL` + +### tidb_runtime_filter_type New in v7.2.0 + +> **Warning:** +> +> The feature controlled by this variable is not fully functional in the current TiDB version. Do not change the default value. 
+ +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Type: Enumeration +- Default value: `IN` +- Value options: `IN` + ### tidb_scatter_region - Scope: GLOBAL From 39dd08d07b7da2c1898feeb26f0ef14d41219dd8 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Wed, 28 Jun 2023 23:37:45 +0800 Subject: [PATCH 11/30] data import: add IMPORT INTO stmt (#13928) --- TOC.md | 3 + error-codes.md | 42 ++- mysql-schema.md | 1 + privilege-management.md | 12 +- .../sql-statement-cancel-import-job.md | 48 ++++ sql-statements/sql-statement-import-into.md | 262 ++++++++++++++++++ .../sql-statement-show-import-job.md | 84 ++++++ system-variables.md | 3 +- tidb-configuration-file.md | 1 + tidb-distributed-execution-framework.md | 26 +- 10 files changed, 469 insertions(+), 13 deletions(-) create mode 100644 sql-statements/sql-statement-cancel-import-job.md create mode 100644 sql-statements/sql-statement-import-into.md create mode 100644 sql-statements/sql-statement-show-import-job.md diff --git a/TOC.md b/TOC.md index 7591f66b0230e..d5006cc5593fc 100644 --- a/TOC.md +++ b/TOC.md @@ -694,6 +694,7 @@ - [`BATCH`](/sql-statements/sql-statement-batch.md) - [`BEGIN`](/sql-statements/sql-statement-begin.md) - [`CALIBRATE RESOURCE`](/sql-statements/sql-statement-calibrate-resource.md) + - [`CANCEL IMPORT JOB`](/sql-statements/sql-statement-cancel-import-job.md) - [`CHANGE COLUMN`](/sql-statements/sql-statement-change-column.md) - [`COMMIT`](/sql-statements/sql-statement-commit.md) - [`CHANGE DRAINER`](/sql-statements/sql-statement-change-drainer.md) @@ -737,6 +738,7 @@ - [`FLUSH TABLES`](/sql-statements/sql-statement-flush-tables.md) - [`GRANT `](/sql-statements/sql-statement-grant-privileges.md) - [`GRANT `](/sql-statements/sql-statement-grant-role.md) + - [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) - [`INSERT`](/sql-statements/sql-statement-insert.md) - [`KILL [TIDB]`](/sql-statements/sql-statement-kill.md) - [`LOAD DATA`](/sql-statements/sql-statement-load-data.md) @@ 
-783,6 +785,7 @@ - [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md) - [`SHOW [FULL] FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md) - [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md) + - [`SHOW IMPORT JOB`](/sql-statements/sql-statement-show-import-job.md) - [`SHOW INDEX [FROM|IN]`](/sql-statements/sql-statement-show-index.md) - [`SHOW INDEXES [FROM|IN]`](/sql-statements/sql-statement-show-indexes.md) - [`SHOW KEYS [FROM|IN]`](/sql-statements/sql-statement-show-keys.md) diff --git a/error-codes.md b/error-codes.md index b96b4c78f0b5a..1b08664077024 100644 --- a/error-codes.md +++ b/error-codes.md @@ -364,23 +364,55 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the * Error Number: 8156 - The file path of the `LOAD DATA` statement cannot be empty. You need to set the correct path before importing. See [`LOAD DATA`](/sql-statements/sql-statement-load-data.md). + The provided path cannot be empty. You need to set a correct path before the import. + +* Error Number: 8157 + + The provided file format is unsupported. For the supported formats, see [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md#format). * Error Number: 8158 - The S3 or GCS path is invalid. See [external storage](/br/backup-and-restore-storages.md) to set a valid path. + The provided path is invalid. Refer to the specific error message for actions. For Amazon S3 or GCS path settings, see [External storage](/br/backup-and-restore-storages.md#uri-format). * Error Number: 8159 - TiDB cannot access the S3 or GCS path provided in the `LOAD DATA` statement. Make sure that the S3 or GCS bucket exists, and that you have used the correct access key and secret access key to let TiDB access the bucket. + TiDB cannot access the provided Amazon S3 or GCS path. Make sure that the specified S3 or GCS bucket exists and that you have provided the correct Access Key and Secret Access Key for TiDB to access the corresponding bucket. 
* Error Number: 8160 - `LOAD DATA` fails to read the data file. Refer to the specific error message for action. + Failed to read the data files. Refer to the specific error message for actions. * Error Number: 8162 - There is an error in the `LOAD DATA` statement. See [`LOAD DATA`](/sql-statements/sql-statement-load-data.md) for supported features. + There is an error in the statement. Refer to the specific error message for actions. + +* Error Number: 8163 + + The provided option is unknown. For supported options, see [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md#parameter-description). + +* Error Number: 8164 + + The provided option value is invalid. For valid values, see [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md#parameter-description). + +* Error Number: 8165 + + Duplicate options are specified. Each option can only be specified once. + +* Error Number: 8166 + + Certain options can only be used in specific conditions. Refer to the specific error message for actions. For supported options, see [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md#parameter-description). + +* Error Number: 8170 + + The specified job does not exist. + +* Error Number: 8171 + + The current operation cannot be performed for the current job status. Refer to the specific error message for actions. + +* Error Number: 8173 + + When executing `IMPORT INTO`, TiDB checks the current environment, such as checking if the downstream table is empty. Refer to the specific error message for actions. * Error Number: 8200 diff --git a/mysql-schema.md b/mysql-schema.md index be8a1b3c9647e..a8f31a8020c89 100644 --- a/mysql-schema.md +++ b/mysql-schema.md @@ -75,5 +75,6 @@ Currently, the `help_topic` is NULL. 
- `expr_pushdown_blacklist`: the blocklist for expression pushdown - `opt_rule_blacklist`: the blocklist for logical optimization rules - `table_cache_meta`: the metadata of cached tables +- `tidb_import_jobs`: the job information of [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) diff --git a/privilege-management.md b/privilege-management.md index 3e7394838021f..2cc48fe20902d 100644 --- a/privilege-management.md +++ b/privilege-management.md @@ -271,6 +271,10 @@ mysql> SELECT * FROM INFORMATION_SCHEMA.USER_PRIVILEGES WHERE grantee = "'root'@ Requires the `SUPER` or `BACKUP_ADMIN` privilege. +### CANCEL IMPORT JOB + +Requires the `SUPER` privilege to cancel jobs created by other users. Otherwise, only jobs created by the current user can be canceled. + ### CREATE DATABASE Requires the `CREATE` privilege for the database. @@ -305,6 +309,10 @@ Requires the `INDEX` privilege for the table. Requires the `DROP` privilege for the table. +### IMPORT INTO + +Requires the `SELECT`, `UPDATE`, `INSERT`, `DELETE`, and `ALTER` privileges for the target table. To import files stored locally in TiDB, the `FILE` privilege is also required. + ### LOAD DATA Requires the `INSERT` privilege for the table. When you use `REPLACE INTO`, the `DELETE` privilege is also required. @@ -329,7 +337,9 @@ Requires the `INSERT` and `SELECT` privileges for the table. `SHOW GRANTS` requires the `SELECT` privilege to the `mysql` database. If the target user is current user, `SHOW GRANTS` does not require any privilege. -`SHOW PROCESSLIST` requires `SUPER` to show connections belonging to other users. +`SHOW PROCESSLIST` requires the `SUPER` privilege to show connections belonging to other users. + +`SHOW IMPORT JOB` requires the `SUPER` privilege to show jobs created by other users. Otherwise, it only shows jobs created by the current user. 
### CREATE ROLE/USER diff --git a/sql-statements/sql-statement-cancel-import-job.md b/sql-statements/sql-statement-cancel-import-job.md new file mode 100644 index 0000000000000..fe0c632b72dda --- /dev/null +++ b/sql-statements/sql-statement-cancel-import-job.md @@ -0,0 +1,48 @@ +--- +title: CANCEL IMPORT +summary: An overview of the usage of CANCEL IMPORT in TiDB. +--- + +# CANCEL IMPORT + +The `CANCEL IMPORT` statement is used to cancel a data import job created in TiDB. + + + +## Required privileges + +To cancel a data import job, you need to be the creator of the import job or have the `SUPER` privilege. + +## Synopsis + +```ebnf+diagram +CancelImportJobsStmt ::= + 'CANCEL' 'IMPORT' 'JOB' JobID +``` + +## Example + +To cancel an import job with the ID as `1`, execute the following statement: + +```sql +CANCEL IMPORT JOB 1; +``` + +The output is as follows: + +``` +Query OK, 0 rows affected (0.01 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) +* [`SHOW IMPORT JOB`](/sql-statements/sql-statement-show-import-job.md) diff --git a/sql-statements/sql-statement-import-into.md b/sql-statements/sql-statement-import-into.md new file mode 100644 index 0000000000000..8ec0608f6fadc --- /dev/null +++ b/sql-statements/sql-statement-import-into.md @@ -0,0 +1,262 @@ +--- +title: IMPORT INTO +summary: An overview of the usage of IMPORT INTO in TiDB. +--- + +# IMPORT INTO + +The `IMPORT INTO` statement is used to import data in formats such as `CSV`, `SQL`, and `PARQUET` into an empty table in TiDB via the [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) of TiDB Lightning. + + + +> **Warning:** +> +> Currently, this statement is experimental. It is not recommended to use it in production environments. + +`IMPORT INTO` supports importing data from files stored in Amazon S3, GCS, and the TiDB local storage. 
+ +- For data files stored in Amazon S3 or GCS, `IMPORT INTO` supports running in the [TiDB backend task distributed execution framework](/tidb-distributed-execution-framework.md). + + - When this framework is enabled ([tidb_enable_dist_task](/system-variables.md#tidb_enable_dist_task-new-in-v710) is `ON`), `IMPORT INTO` splits a data import job into multiple sub-jobs and distributes these sub-jobs to different TiDB nodes for execution to improve the import efficiency. + - When this framework is disabled, `IMPORT INTO` only supports running on the TiDB node where the current user is connected. + +- For data files stored locally in TiDB, `IMPORT INTO` only supports running on the TiDB node where the current user is connected. Therefore, the data files need to be placed on the TiDB node where the current user is connected. If you access TiDB through a proxy or load balancer, you cannot import data files stored locally in TiDB. + +## Restrictions + +- Currently, `IMPORT INTO` only supports importing data within 1 TiB. +- `IMPORT INTO` only supports importing data into existing empty tables in the database. +- `IMPORT INTO` does not support transactions or rollback. Executing `IMPORT INTO` within an explicit transaction (`BEGIN`/`END`) will return an error. +- The execution of `IMPORT INTO` blocks the current connection until the import is completed. To execute the statement asynchronously, you can add the `DETACHED` option. +- `IMPORT INTO` does not support working simultaneously with features such as [Backup & Restore](/br/backup-and-restore-overview.md), [`FLASHBACK CLUSTER TO TIMESTAMP`](/sql-statements/sql-statement-flashback-to-timestamp.md), [acceleration of adding indexes](/system-variables.md#tidb_ddl_enable_fast_reorg-new-in-v630), data import using TiDB Lightning, data replication using TiCDC, or [Point-in-Time Recovery (PITR)](/br/br-log-architecture.md). +- Only one `IMPORT INTO` job can run on a cluster at a time. 
Although `IMPORT INTO` performs a precheck for running jobs, it is not a hard limit. Starting multiple import jobs might work when multiple clients execute `IMPORT INTO` simultaneously, but you need to avoid that because it might result in data inconsistency or import failures. +- During the data import process, do not perform DDL or DML operations on the target table, and do not execute [`FLASHBACK DATABASE`](/sql-statements/sql-statement-flashback-database.md) for the target database. These operations can lead to import failures or data inconsistencies. In addition, it is **NOT** recommended to perform read operations during the import process, as the data being read might be inconsistent. Perform read and write operations only after the import is completed. +- The import process consumes system resources significantly. To get better performance, it is recommended to use TiDB nodes with at least 32 cores and 64 GiB of memory. TiDB writes sorted data to the TiDB [temporary directory](/tidb-configuration-file.md#temp-dir-new-in-v630) during import, so it is recommended to configure high-performance storage media such as flash memory. For more information, see [Physical Import Mode limitations](/tidb-lightning/tidb-lightning-physical-import-mode.md#requirements-and-restrictions). +- The TiDB [temporary directory](/tidb-configuration-file.md#temp-dir-new-in-v630) is expected to have at least 90 GiB of available space. It is recommended to allocate storage space that is equal to or greater than the volume of data to be imported. +- One import job supports importing data into one target table only. To import data into multiple target tables, after the import for a target table is completed, you need to create a new job for the next target table. + +## Prerequisites for import + +Before using `IMPORT INTO` to import data, make sure the following requirements are met: + +- The target table to be imported is already created in TiDB and it is empty. 
+- The target cluster has sufficient space to store the data to be imported. +- The [temporary directory](/tidb-configuration-file.md#temp-dir-new-in-v630) of the TiDB node connected to the current session has at least 90 GiB of available space. If [`tidb_enable_dist_task`](/system-variables.md#tidb_enable_dist_task-new-in-v710) is enabled, also make sure that the temporary directory of each TiDB node in the cluster has sufficient disk space. + +## Required privileges + +Executing `IMPORT INTO` requires the `SELECT`, `UPDATE`, `INSERT`, `DELETE`, and `ALTER` privileges on the target table. To import files in TiDB local storage, the `FILE` privilege is also required. + +## Synopsis + +```ebnf+diagram +ImportIntoStmt ::= + 'IMPORT' 'INTO' TableName ColumnNameOrUserVarList? SetClause? FROM fileLocation Format? WithOptions? + +ColumnNameOrUserVarList ::= + '(' ColumnNameOrUserVar (',' ColumnNameOrUserVar)* ')' + +SetClause ::= + 'SET' SetItem (',' SetItem)* + +SetItem ::= + ColumnName '=' Expr + +Format ::= + 'CSV' | 'SQL' | 'PARQUET' + +WithOptions ::= + 'WITH' OptionItem (',' OptionItem)* + +OptionItem ::= + optionName '=' optionVal | optionName +``` + +## Parameter description + +### ColumnNameOrUserVarList + +It specifies how each field in the data file corresponds to the columns in the target table. You can also use it to map fields to variables to skip certain fields for the import, or use it in `SetClause`. + +- If this parameter is not specified, the number of fields in each row of the data file must match the number of columns in the target table, and the fields will be imported to the corresponding columns in order. +- If this parameter is specified, the number of specified columns or variables must match the number of fields in each row of the data file. + +### SetClause + +It specifies how the values of target columns are calculated. In the right side of the `SET` expression, you can reference the variables specified in `ColumnNameOrUserVarList`. 
+ +In the left side of the `SET` expression, you can only reference a column name that is not included in `ColumnNameOrUserVarList`. If the target column name already exists in `ColumnNameOrUserVarList`, the `SET` expression is invalid. + +### fileLocation + +It specifies the storage location of the data file, which can be an Amazon S3 or GCS URI path, or a TiDB local file path. + +- Amazon S3 or GCS URI path: for URI configuration details, see [External storage](/br/backup-and-restore-storages.md#uri-format). +- TiDB local file path: it must be an absolute path, and the file extension must be `.csv`, `.sql`, or `.parquet`. Make sure that the files corresponding to this path are stored on the TiDB node connected by the current user, and the user has the `FILE` privilege. + +> **Note:** +> +> If [SEM](/system-variables.md#tidb_enable_enhanced_security) is enabled in the target cluster, the `fileLocation` cannot be specified as a local file path. + +In the `fileLocation` parameter, you can specify a single file or use the `*` wildcard to match multiple files for import. Note that the wildcard can only be used in the file name, because it does not match directories or recursively match files in subdirectories. Taking files stored on Amazon S3 as examples, you can configure the parameter as follows: + +- Import a single file: `s3:///path/to/data/foo.csv` +- Import all files in a specified path: `s3:///path/to/data/*` +- Import all files with the `.csv` suffix in a specified path: `s3:///path/to/data/*.csv` +- Import all files with the `foo` prefix in a specified path: `s3:///path/to/data/foo*` +- Import all files with the `foo` prefix and the `.csv` suffix in a specified path: `s3:///path/to/data/foo*.csv` + +### Format + +The `IMPORT INTO` statement supports three data file formats: `CSV`, `SQL`, and `PARQUET`. If not specified, the default format is `CSV`. + +### WithOptions + +You can use `WithOptions` to specify import options and control the data import process. 
For example, to execute the import asynchronously in the backend, you can enable the `DETACHED` mode for the import by adding the `WITH DETACHED` option to the `IMPORT INTO` statement. + +The supported options are described as follows: + +| Option name | Supported data formats | Description | +|:---|:---|:---| +| `CHARACTER_SET=''` | CSV | Specifies the character set of the data file. The default character set is `utf8mb4`. The supported character sets include `binary`, `utf8`, `utf8mb4`, `gb18030`, `gbk`, `latin1`, and `ascii`. | +| `FIELDS_TERMINATED_BY=''` | CSV | Specifies the field separator. The default separator is `,`. | +| `FIELDS_ENCLOSED_BY=''` | CSV | Specifies the field delimiter. The default delimiter is `"`. | +| `FIELDS_ESCAPED_BY=''` | CSV | Specifies the escape character for fields. The default escape character is `\`. | +| `FIELDS_DEFINED_NULL_BY=''` | CSV | Specifies the value that represents `NULL` in the fields. The default value is `\N`. | +| `LINES_TERMINATED_BY=''` | CSV | Specifies the line terminator. By default, `IMPORT INTO` automatically identifies `\n`, `\r`, or `\r\n` as line terminators. If the line terminator is one of these three, you do not need to explicitly specify this option. | +| `SKIP_ROWS=` | CSV | Specifies the number of rows to skip. The default value is `0`. You can use this option to skip the header in a CSV file. If you use a wildcard to specify the source files for import, this option applies to all source files that are matched by the wildcard in `fileLocation`. | +| `DISK_QUOTA=''` | All formats | Specifies the disk space threshold that can be used during data sorting. The default value is 80% of the disk space in the TiDB [temporary directory](/tidb-configuration-file.md#temp-dir-new-in-v630). If the total disk size cannot be obtained, the default value is 50 GiB. When specifying `DISK_QUOTA` explicitly, make sure that the value does not exceed 80% of the disk space in the TiDB temporary directory. 
| +| `DISABLE_TIKV_IMPORT_MODE` | All formats | Specifies whether to disable switching TiKV to import mode during the import process. By default, switching TiKV to import mode is not disabled. If there are ongoing read-write operations in the cluster, you can enable this option to avoid impact from the import process. | +| `THREAD=` | All formats | Specifies the concurrency for import. The default value is 50% of the CPU cores, with a minimum value of 1. You can explicitly specify this option to control the resource usage, but make sure that the value does not exceed the number of CPU cores. To import data into a new cluster without any data, it is recommended to increase this concurrency appropriately to improve import performance. If the target cluster is already used in a production environment, it is recommended to adjust this concurrency according to your application requirements. | +| `MAX_WRITE_SPEED=''` | All formats | Controls the write speed to a TiKV node. By default, there is no speed limit. For example, you can specify this option as `1MiB` to limit the write speed to 1 MiB/s. | +| `CHECKSUM_TABLE=''` | All formats | Configures whether to perform a checksum check on the target table after the import to validate the import integrity. The supported values include `"required"` (default), `"optional"`, and `"off"`. `"required"` means performing a checksum check after the import. If the checksum check fails, TiDB will return an error and the import will exit. `"optional"` means performing a checksum check after the import. If an error occurs, TiDB will return a warning and ignore the error. `"off"` means not performing a checksum check after the import. | +| `DETACHED` | All Formats | Controls whether to execute `IMPORT INTO` asynchronously. When this option is enabled, executing `IMPORT INTO` immediately returns the information of the import job (such as the `Job_ID`), and the job is executed asynchronously in the backend. 
| + +## Output + +When `IMPORT INTO` completes the import or when the `DETACHED` mode is enabled, `IMPORT INTO` will return the current job information in the output, as shown in the following examples. For the description of each field, see [`SHOW IMPORT JOB(s)`](/sql-statements/sql-statement-show-import-job.md). + +When `IMPORT INTO` completes the import, the example output is as follows: + +```sql +IMPORT INTO t FROM '/path/to/small.csv'; ++--------+--------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +| Job_ID | Data_Source | Target_Table | Table_ID | Phase | Status | Source_File_Size | Imported_Rows | Result_Message | Create_Time | Start_Time | End_Time | Created_By | ++--------+--------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +| 60002 | /path/to/small.csv | `test`.`t` | 363 | | finished | 16B | 2 | | 2023-06-08 16:01:22.095698 | 2023-06-08 16:01:22.394418 | 2023-06-08 16:01:26.531821 | root@% | ++--------+--------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +``` + +When the `DETACHED` mode is enabled, executing the `IMPORT INTO` statement will immediately return the job information in the output. From the output, you can see that the status of the job is `pending`, which means waiting for execution. 
+ +```sql +IMPORT INTO t FROM '/path/to/small.csv' WITH DETACHED; ++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +| Job_ID | Data_Source | Target_Table | Table_ID | Phase | Status | Source_File_Size | Imported_Rows | Result_Message | Create_Time | Start_Time | End_Time | Created_By | ++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +| 60001 | /path/to/small.csv | `test`.`t` | 361 | | pending | 16B | NULL | | 2023-06-08 15:59:37.047703 | NULL | NULL | root@% | ++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +``` + +## View and manage import jobs + +For an import job with the `DETACHED` mode enabled, you can use [`SHOW IMPORT`](/sql-statements/sql-statement-show-import-job.md) to view its current job progress. + +After an import job is started, you can cancel it using [`CANCEL IMPORT JOB `](/sql-statements/sql-statement-cancel-import-job.md). + +## Examples + +### Import a CSV file with headers + +```sql +IMPORT INTO t FROM '/path/to/file.csv' WITH skip_rows=1; +``` + +### Import a file asynchronously in the `DETACHED` mode + +```sql +IMPORT INTO t FROM '/path/to/file.csv' WITH DETACHED; +``` + +### Skip importing a specific field in your data file + +Assume that your data file is in the CSV format and its content is as follows: + +``` +id,name,age +1,Tom,23 +2,Jack,44 +``` + +And assume that the target table schema for the import is `CREATE TABLE t(id int primary key, name varchar(100))`. 
To skip importing the `age` field in the data file to the table `t`, you can execute the following SQL statement: + +```sql +IMPORT INTO t(id, name, @1) FROM '/path/to/file.csv' WITH skip_rows=1; +``` + +### Import multiple data files using the wildcard `*` + +Assume that there are three files named `file-01.csv`, `file-02.csv`, and `file-03.csv` in the `/path/to/` directory. To import these three files into a target table `t` using `IMPORT INTO`, you can execute the following SQL statement: + +```sql +IMPORT INTO t FROM '/path/to/file-*.csv'; +``` + +### Import data files from Amazon S3 or GCS + +- Import data files from Amazon S3: + + ```sql + IMPORT INTO t FROM 's3://bucket-name/test.csv?access-key=XXX&secret-access-key=XXX'; + ``` + +- Import data files from GCS: + + ```sql + IMPORT INTO t FROM 'gs://bucket-name/test.csv'; + ``` + +For details about the URI path configuration for Amazon S3 or GCS, see [External storage](/br/backup-and-restore-storages.md#uri-format). + +### Calculate column values using SetClause + +Assume that your data file is in the CSV format and its content is as follows: + +``` +id,name,val +1,phone,230 +2,book,440 +``` + +And assume that the target table schema for the import is `CREATE TABLE t(id int primary key, name varchar(100), val int)`. If you want to multiply the `val` column values by 100 during the import, you can execute the following SQL statement: + +```sql +IMPORT INTO t(id, name, @1) SET val=@1*100 FROM '/path/to/file.csv' WITH skip_rows=1; +``` + +### Import a data file in the SQL format + +```sql +IMPORT INTO t FROM '/path/to/file.sql' FORMAT 'sql'; +``` + +### Limit the write speed to TiKV + +To limit the write speed to a TiKV node to 10 MiB/s, execute the following SQL statement: + +```sql +IMPORT INTO t FROM 's3://bucket/path/to/file.parquet?access-key=XXX&secret-access-key=XXX' FORMAT 'parquet' WITH MAX_WRITE_SPEED='10MiB'; +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. 
+ +## See also + +* [`SHOW IMPORT JOB(s)`](/sql-statements/sql-statement-show-import-job.md) +* [`CANCEL IMPORT JOB`](/sql-statements/sql-statement-cancel-import-job.md) diff --git a/sql-statements/sql-statement-show-import-job.md b/sql-statements/sql-statement-show-import-job.md new file mode 100644 index 0000000000000..f392edb3df1c2 --- /dev/null +++ b/sql-statements/sql-statement-show-import-job.md @@ -0,0 +1,84 @@ +--- +title: SHOW IMPORT +summary: An overview of the usage of SHOW IMPORT in TiDB. +--- + +# SHOW IMPORT + +The `SHOW IMPORT` statement is used to show the import jobs created in TiDB. Without the `SUPER` privilege, this statement shows only jobs created by the current user. + + + +## Required privileges + +- `SHOW IMPORT JOBS`: if a user has the `SUPER` privilege, this statement shows all import jobs in TiDB. Otherwise, this statement only shows jobs created by the current user. +- `SHOW IMPORT JOB `: only the creator of an import job or users with the `SUPER` privilege can use this statement to view a specific job. + +## Synopsis + +```ebnf+diagram +ShowImportJobsStmt ::= + 'SHOW' 'IMPORT' 'JOBS' + +ShowImportJobStmt ::= + 'SHOW' 'IMPORT' 'JOB' JobID +``` + +The output fields of the `SHOW IMPORT` statement are described as follows: + +| Column | Description | +|------------------|-------------------------| +| Job_ID | The ID of the task | +| Data_Source | Information about the data source | +| Target_Table | The name of the target table | +| Phase | The current phase of the job, including `importing`, `validating`, and `add-index` | +| Status | The current status of the job, including `pending` (means created but not started yet), `running`, `canceled`, `failed`, and `finished` | +| Source_File_Size | The size of the source file | +| Imported_Rows | The number of data rows that have been read and written to the target table | +| Result_Message | If the import fails, this field returns the error message. 
Otherwise, it is empty.| +| Create_Time | The time when the task is created | +| Start_Time | The time when the task is started | +| End_Time | The time when the task is ended | +| Created_By | The name of the database user who creates the task | + +## Example + +```sql +SHOW IMPORT JOBS; +``` + +``` ++--------+-------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +| Job_ID | Data_Source | Target_Table | Table_ID | Phase | Status | Source_File_Size | Imported_Rows | Result_Message | Create_Time | Start_Time | End_Time | Created_By | ++--------+-------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +| 1 | /path/to/file.csv | `test`.`foo` | 116 | | finished | 11GB | 950000 | | 2023-06-26 11:23:59.281257 | 2023-06-26 11:23:59.484932 | 2023-06-26 13:04:30.622952 | root@% | +| 2 | /path/to/file.csv | `test`.`bar` | 130 | | finished | 1.194TB | 49995000 | | 2023-06-26 15:42:45.079237 | 2023-06-26 15:42:45.388108 | 2023-06-26 17:29:43.023568 | root@% | ++--------+-------------------+--------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+ +2 rows in set (0.01 sec) +``` + +```sql +SHOW IMPORT JOB 60001; +``` + +``` ++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +| Job_ID | Data_Source | Target_Table | Table_ID | Phase | Status | Source_File_Size | Imported_Rows | Result_Message | Create_Time | Start_Time | End_Time | Created_By | 
++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +| 60001 | /path/to/small.csv | `test`.`t` | 361 | | pending | 16B | NULL | | 2023-06-08 15:59:37.047703 | NULL | NULL | root@% | ++--------+--------------------+--------------+----------+-------+---------+------------------+---------------+----------------+----------------------------+------------+----------+------------+ +1 row in set (0.01 sec) +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) +* [`CANCEL IMPORT JOB`](/sql-statements/sql-statement-cancel-import-job.md) diff --git a/system-variables.md b/system-variables.md index e764639e3200b..05b788e8dbda4 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1410,7 +1410,8 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; - Persists to cluster: Yes - Default value: `OFF` - This variable is used to control whether to enable the [TiDB backend task distributed execution framework](/tidb-distributed-execution-framework.md). After the framework is enabled, backend tasks such as DDL and import will be distributedly executed and completed by multiple TiDB nodes in the cluster. -- In TiDB v7.1.0, the framework supports distributedly executing only the `ADD INDEX` statement for partitioned tables. +- Starting from TiDB v7.1.0, the framework supports distributedly executing the [`ADD INDEX`](/sql-statements/sql-statement-add-index.md) statement for partitioned tables. +- Starting from TiDB v7.2.0, the framework supports distributedly executing the [`IMPORT INTO`](https://docs.pingcap.com/tidb/v7.2/sql-statement-import-into) statement for import jobs of TiDB Self-Hosted. For TiDB Cloud, the `IMPORT INTO` statement is not applicable. 
- This variable is renamed from `tidb_ddl_distribute_reorg`. ### tidb_ddl_error_count_limit diff --git a/tidb-configuration-file.md b/tidb-configuration-file.md index d9f86918002a3..0f5525bb20982 100644 --- a/tidb-configuration-file.md +++ b/tidb-configuration-file.md @@ -48,6 +48,7 @@ The TiDB configuration file supports more options than command-line parameters. + File system location used by TiDB to store temporary data. If a feature requires local storage in TiDB nodes, TiDB stores the corresponding temporary data in this location. + When creating an index, if [`tidb_ddl_enable_fast_reorg`](/system-variables.md#tidb_ddl_enable_fast_reorg-new-in-v630) is enabled, data that needs to be backfilled for a newly created index will be at first stored in the TiDB local temporary directory, and then imported into TiKV in batches, thus accelerating the index creation. ++ When [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md) is used to import data, the sorted data is first stored in the TiDB local temporary directory, and then imported into TiKV in batches. + Default value: `"/tmp/tidb"` ### `oom-use-tmp-storage` diff --git a/tidb-distributed-execution-framework.md b/tidb-distributed-execution-framework.md index 3678837d58308..5ca6a483ac1ac 100644 --- a/tidb-distributed-execution-framework.md +++ b/tidb-distributed-execution-framework.md @@ -27,7 +27,17 @@ This document describes the use cases, limitations, usage, and implementation pr ## Use cases and limitations -In a database management system, in addition to the core transactional processing (TP) and analytical processing (AP) workloads, there are other important tasks, such as DDL operations, Load Data, TTL, Analyze, and Backup/Restore, which are called **backend tasks**. 
These backend tasks need to process a large amount of data in database objects (tables), so they typically have the following characteristics: + + +In a database management system, in addition to the core transactional processing (TP) and analytical processing (AP) workloads, there are other important tasks, such as DDL operations, IMPORT INTO, TTL, Analyze, and Backup/Restore, which are called **backend tasks**. These backend tasks need to process a large amount of data in database objects (tables), so they typically have the following characteristics: + + + + + +In a database management system, in addition to the core transactional processing (TP) and analytical processing (AP) workloads, there are other important tasks, such as DDL operations, TTL, Analyze, and Backup/Restore, which are called **backend tasks**. These backend tasks need to process a large amount of data in database objects (tables), so they typically have the following characteristics: + + - Need to process all data in a schema or a database object (table). - Might need to be executed periodically, but at a low frequency. @@ -39,12 +49,16 @@ Enabling the TiDB backend task distributed execution framework can solve the abo - The framework supports distributed execution of backend tasks, which can flexibly schedule the available computing resources of the entire TiDB cluster, thereby better utilizing the computing resources in a TiDB cluster. - The framework provides unified resource usage and management capabilities for both overall and individual backend tasks. -Currently, the TiDB backend task distributed execution framework only supports the distributed execution of `ADD INDEX` statements, that is, the DDL statements for creating indexes. For example, the following SQL statements are supported: +Currently, for TiDB Self-Hosted, the TiDB backend task distributed execution framework supports the distributed execution of the `ADD INDEX` and `IMPORT INTO` statements. 
For TiDB Cloud, the `IMPORT INTO` statement is not applicable. + +- `ADD INDEX` is a DDL statement used to create indexes. For example: + + ```sql + ALTER TABLE t1 ADD INDEX idx1(c1); + CREATE INDEX idx1 ON t1(c1); + ``` -```sql -ALTER TABLE t1 ADD INDEX idx1(c1); -CREATE INDEX idx1 ON table t1(c1); -``` +- `IMPORT INTO` is used to import data in formats such as `CSV`, `SQL`, and `PARQUET` into an empty table. For more information, see [`IMPORT INTO`](https://docs.pingcap.com/tidb/v7.2/sql-statement-import-into). ## Prerequisites From 5347157e2aa42beb33a3f4b0570a0110e30fef12 Mon Sep 17 00:00:00 2001 From: niubell Date: Thu, 29 Jun 2023 11:16:43 +0800 Subject: [PATCH 12/30] tidb-lightning: add 50TB data import best practices doc (#13921) --- TOC.md | 1 + tidb-lightning/data-import-best-practices.md | 155 +++++++++++++++++++ 2 files changed, 156 insertions(+) create mode 100644 tidb-lightning/data-import-best-practices.md diff --git a/TOC.md b/TOC.md index d5006cc5593fc..38551e75bd240 100644 --- a/TOC.md +++ b/TOC.md @@ -116,6 +116,7 @@ - Migrate - [Overview](/migration-overview.md) - [Migration Tools](/migration-tools.md) + - [Import Best Practices](/tidb-lightning/data-import-best-practices.md) - Migration Scenarios - [Migrate from Aurora](/migrate-aurora-to-tidb.md) - [Migrate MySQL of Small Datasets](/migrate-small-mysql-to-tidb.md) diff --git a/tidb-lightning/data-import-best-practices.md b/tidb-lightning/data-import-best-practices.md new file mode 100644 index 0000000000000..3841a59cdf2a9 --- /dev/null +++ b/tidb-lightning/data-import-best-practices.md @@ -0,0 +1,155 @@ +--- +title: Best Practices for Importing 50 TiB Data +summary: Learn best practices for importing large volumes of data. +--- + +# Best Practices for Importing 50 TiB Data + +This document provides best practices for importing large volumes of data into TiDB, including some key factors and steps that affect data import. 
We have successfully imported a single large table of over 50 TiB into both internal and customer environments. The best practices in this document are based on these real application scenarios and can help you import data more smoothly and efficiently. + +TiDB Lightning ([Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md)) is a comprehensive and efficient data import tool that uses files as the data source. It is used for importing data into empty tables and initializing empty clusters. TiDB Lightning provides two running modes: a single instance and [parallel import](/tidb-lightning/tidb-lightning-distributed-import.md). Choose the mode according to the size of the source files: + +- If the data size of the source files is within 10 TiB, it is recommended to use a single instance of TiDB Lightning for the import. +- If the data size of the source files exceeds 10 TiB, it is recommended to use multiple instances of TiDB Lightning for [parallel import](/tidb-lightning/tidb-lightning-distributed-import.md). +- If the data size of the source files is exceptionally large (larger than 50 TiB), in addition to parallel import, you need to make certain preparations and optimizations based on the characteristics of the source data, table definitions, and parameter configurations to achieve smoother and faster large-scale data import. 
+ +The following sections apply to both importing multiple tables and importing large single tables: + +- [Key factors](#key-factors) +- [Prepare source files](#prepare-source-files) +- [Estimate storage space](#estimate-storage-space) +- [Change configuration parameters](#change-configuration-parameters) +- [Resolve the "checksum mismatch" error](#resolve-the-checksum-mismatch-error) +- [Enable checkpoint](#enable-checkpoint) +- [Troubleshooting](#troubleshooting) + +The best practices for importing large single tables are described separately in the following section because a large single table has special requirements: + +- [Best practices for importing a large single table](#best-practices-for-importing-a-large-single-table) + +## Key factors + +When you import data, some key factors can affect import performance and might even cause the import to fail. Common key factors are as follows: + +- Source files + + - Whether the data within a single file is sorted by the primary key. Sorted data can achieve optimal import performance. + - Whether overlapping primary keys or non-null unique indexes exist between source files imported by multiple TiDB Lightning instances. The smaller the overlap is, the better the import performance. + +- Table definitions + + - The number and size of secondary indexes per table can affect the import speed. Fewer indexes result in faster imports and less space consumption after import. + - Index data size = Number of indexes \* Index size \* Number of rows. + +- Compression ratio + + - Data imported into a TiDB cluster is stored in a compressed format. The compression ratio cannot be calculated in advance. It can only be determined after the data is actually imported into the TiKV cluster. + - As a best practice, you can first import a small portion of the data (for example, 10%) to obtain the corresponding compression ratio of the cluster, and then use it to estimate the compression ratio of the entire data import. 
+ +- Configuration parameters + + - `region-concurrency`: The concurrency of TiDB Lightning main logical processing. + - `send-kv-pairs`: The number of Key-Value pairs sent by TiDB Lightning to TiKV in a single request. + - `disk-quota`: The disk quota used by TiDB Lightning local temp files when using the physical import mode. + - `GOMEMLIMIT`: TiDB Lightning is implemented in the Go language. [Configure `GOMEMLIMIT` properly.](#change-configuration-parameters) + +- Data validation + + After data and index import is completed, the [`ADMIN CHECKSUM`](/sql-statements/sql-statement-admin-checksum-table.md) statement is executed on each table, and the checksum value is compared with the local checksum value of TiDB Lightning. When many tables exist, or an individual table has a large number of rows, the checksum phase can take a long time. + +- Execution plan + + After the checksum is successfully completed, the [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) statement is executed on each table to generate the optimal execution plan. The [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) operation can be time-consuming when dealing with a large number of tables or an individual table with a significant amount of data. + +- Relevant issues + + During the actual process of importing 50 TiB of data, certain issues might occur that are only exposed when dealing with a massive number of source files and large-scale clusters. When choosing a product version, it is recommended to check whether the corresponding issues have been fixed. + + The following issues have been resolved in v6.5.3, v7.1.0, and later versions: + + - [Issue-14745](https://github.com/tikv/tikv/issues/14745): After the import is completed, a large number of temporary files are left in the TiKV import directory. 
+ - [Issue-6426](https://github.com/tikv/pd/issues/6426): The PD [range scheduling](/tidb-lightning/tidb-lightning-physical-import-mode-usage.md#scope-of-pausing-scheduling-during-import) interface might fail to scatter regions, resulting in timeout issues. Before v6.2.0, global scheduling is disabled by default, which can avoid triggering this problem. + - [Issue-43079](https://github.com/pingcap/tidb/pull/43079): TiDB Lightning fails to refresh the Region Peers information during retry for NotLeader errors. + - [Issue-43291](https://github.com/pingcap/tidb/issues/43291): TiDB Lightning does not retry in cases where temporary files are not found (the "No such file or directory" error). + +## Prepare source files + +- When generating source files, it is preferable to sort them by the primary key within a single file. If the table definition does not have a primary key, you can add an auto-increment primary key. In this case, the order of the file content does not matter. +- When assigning source files to multiple TiDB Lightning instances, try to avoid the situation where overlapping primary keys or non-null unique indexes exist between multiple source files. If the generated files are globally sorted, they can be distributed into different TiDB Lightning instances based on ranges to achieve optimal import performance. +- Control each file to be less than 96 MiB in size during file generation. +- If a file is exceptionally large and exceeds 256 MiB, enable [`strict-format`](/migrate-from-csv-files-to-tidb.md#step-4-tune-the-import-performance-optional). + +## Estimate storage space + +You can use either of the following two methods to estimate the storage space required for importing data: + +- Assuming the total data size is **A**, the total index size is **B**, the replication factor is **3**, and the compression ratio is **α** (typically around 2.5), the overall occupied space can be calculated as: **(A+B)\*3/α**. 
This method is primarily used to plan the cluster topology before any data is imported. +- Import only 10% of the data and multiply the actual occupied space by 10 to estimate the final space usage for that batch of data. This method is more accurate, especially when you import a large amount of data. + +Note that it is recommended to reserve 20% of storage space, because background tasks such as compaction and snapshot replication also consume a portion of the storage space. + +## Change configuration parameters + +- `region-concurrency`: The concurrency of TiDB Lightning main logical processing. During parallel importing, it is recommended to set it to 75% of the CPU cores to prevent resource overload and potential OOM issues. +- `send-kv-pairs`: The number of Key-Value pairs sent by TiDB Lightning to TiKV in a single request. It is recommended to adjust this value based on the formula `send-kv-pairs` \* row-size < 1 MiB. Starting from v7.2.0, this parameter is replaced by `send-kv-size`, and no additional setting is required. +- `disk-quota`: It is recommended to ensure that the sorting directory space of TiDB Lightning is larger than the size of the data source. If you cannot ensure that, you can set `disk-quota` to 80% of the sorting directory space of TiDB Lightning. In this way, TiDB Lightning sorts and writes data in batches according to the specified `disk-quota`. Note that this approach might result in lower import performance compared to a complete sorting process. +- `GOMEMLIMIT`: TiDB Lightning is implemented in the Go language. Set `GOMEMLIMIT` to 80% of the instance memory to reduce the probability of OOM caused by the Go GC mechanism. + +For more information about TiDB Lightning parameters, see [TiDB Lightning configuration parameters](/tidb-lightning/tidb-lightning-configuration.md). + +## Resolve the "checksum mismatch" error + +Conflicts might occur during data validation. The error message is "checksum mismatch". 
To resolve this issue, take the following steps as needed: + +1. In the source data, check for conflicting primary keys or unique keys, and resolve the conflicts before reimporting. This is the most common cause. +2. Check whether the primary key or unique key definition of the table is reasonable. If not, modify the table definition and reimport the data. +3. If the issue persists after following the preceding two steps, further examination is required to determine whether a small amount (less than 10%) of unexpected conflicting data exists in the source data. To let TiDB Lightning detect and resolve conflicting data, enable [conflict detection](/tidb-lightning/tidb-lightning-physical-import-mode-usage.md#conflict-detection). + +## Enable checkpoint + +When importing a large volume of data, it is essential to enable checkpoints. For details, see [Lightning Checkpoints](/tidb-lightning/tidb-lightning-checkpoints.md). It is recommended to use MySQL as the checkpoint driver so that the checkpoint information is not lost if TiDB Lightning runs in a container that might exit and delete it. + +If you encounter insufficient space in downstream TiKV during import, you can manually run the `kill` command (without the `-9` option) on all TiDB Lightning instances. After scaling up the capacity, you can resume the import based on the checkpoint information. + +## Best practices for importing a large single table + +Importing multiple tables can increase the time required for checksum and analyze operations, sometimes exceeding the time required for the data import itself. However, it is generally not necessary to adjust the configuration. If one or more large tables exist among the multiple tables, it is recommended to separate the source files of these large tables and import them separately. + +This section provides the best practices for importing large single tables. 
There is no strict definition of a large single table, but a table is generally considered large if it meets one of the following criteria: + +- The table size exceeds 10 TiB. +- The number of rows exceeds 1 billion and the number of columns exceeds 50 in a wide table. + +### Generate source files + +Follow the steps outlined in [Prepare source files](#prepare-source-files). + +For a large single table, if global sorting is not achievable but sorting within each file based on the primary key is possible, and the file is a standard CSV file, it is recommended to generate large single files of around 20 GiB each. + +Then, enable `strict-format`. This approach reduces the overlap of primary and unique keys in the imported files between TiDB Lightning instances, and TiDB Lightning instances can split the large files before importing to achieve optimal import performance. + +### Plan cluster topology + +Prepare enough TiDB Lightning instances so that each instance processes 5 TiB to 10 TiB of source data. Deploy one TiDB Lightning instance on each node. For the specifications of the nodes, refer to the [environment requirements](/tidb-lightning/tidb-lightning-physical-import-mode.md#environment-requirements) of TiDB Lightning instances. + +### Change configuration parameters + +- Set `region-concurrency` to 75% of the number of cores of the TiDB Lightning instance. +- Set `send-kv-pairs` to `3200`. This method applies to TiDB v7.1.0 and earlier versions. Starting from v7.2.0, this parameter is replaced by `send-kv-size`, and no additional setting is required. +- Adjust `GOMEMLIMIT` to 80% of the memory on the node where the instance is located. + +If the PD Scatter Region latency during the import process exceeds 30 minutes, consider the following optimizations: + +- Check whether the TiKV cluster encounters any I/O bottlenecks. +- Increase TiKV `raftstore.apply-pool-size` from the default value of `2` to `4` or `8`. 
+- Reduce TiDB Lightning `region-split-concurrency` to half the number of CPU cores, with a minimum value of `1`. + +### Disable the execution plan + +In the case of a large single table (for example, with over 1 billion rows and more than 50 columns), it is recommended to disable the `analyze` operation (`analyze="off"`) during the import process, and manually execute the [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) statement after the import is completed. + +For more information about the configuration of `analyze`, see [TiDB Lightning task configuration](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task). + +## Troubleshooting + +If you encounter problems while using TiDB Lightning, see [Troubleshoot TiDB Lightning](/tidb-lightning/troubleshoot-tidb-lightning.md). From 5d7902ebe679277e5b8ab5b9a7406ffbc56f2a24 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Thu, 29 Jun 2023 14:07:43 +0800 Subject: [PATCH 13/30] TiDB features: add the v7.2 column (#13884) --- basic-features.md | 379 +++++++++++++++++++++++----------------------- 1 file changed, 192 insertions(+), 187 deletions(-) diff --git a/basic-features.md b/basic-features.md index 6065d1fc84fd6..49d791332e033 100644 --- a/basic-features.md +++ b/basic-features.md @@ -22,222 +22,227 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u ## Data types, functions, and operators -| Data types, functions, and operators | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Numeric types](/data-type-numeric.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Date and time types](/data-type-date-and-time.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [String types](/data-type-string.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [JSON type](/data-type-json.md) | Y | Y | E | E | E | E | E | E | E | -| [Control flow functions](/functions-and-operators/control-flow-functions.md) | Y | Y | Y | Y | 
Y | Y | Y | Y | Y | -| [String functions](/functions-and-operators/string-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Numeric functions and operators](/functions-and-operators/numeric-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Date and time functions](/functions-and-operators/date-and-time-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Bit functions and operators](/functions-and-operators/bit-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Cast functions and operators](/functions-and-operators/cast-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Encryption and compression functions](/functions-and-operators/encryption-and-compression-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Information functions](/functions-and-operators/information-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [JSON functions](/functions-and-operators/json-functions.md) | Y | Y | E | E | E | E | E | E | E | -| [Aggregation functions](/functions-and-operators/aggregate-group-by-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Window functions](/functions-and-operators/window-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Miscellaneous functions](/functions-and-operators/miscellaneous-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Operators](/functions-and-operators/operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Character sets and collations](/character-set-and-collation.md) [^1] | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [User-level lock](/functions-and-operators/locking-functions.md) | Y | Y | Y | N | N | N | N | N | N | +| Data types, functions, and operators | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Numeric types](/data-type-numeric.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Date and time types](/data-type-date-and-time.md) | Y | Y | Y | Y | Y | Y | Y | 
Y | Y | Y | +| [String types](/data-type-string.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [JSON type](/data-type-json.md) | Y | Y | Y | E | E | E | E | E | E | E | +| [Control flow functions](/functions-and-operators/control-flow-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [String functions](/functions-and-operators/string-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Numeric functions and operators](/functions-and-operators/numeric-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Date and time functions](/functions-and-operators/date-and-time-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Bit functions and operators](/functions-and-operators/bit-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Cast functions and operators](/functions-and-operators/cast-functions-and-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Encryption and compression functions](/functions-and-operators/encryption-and-compression-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Information functions](/functions-and-operators/information-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [JSON functions](/functions-and-operators/json-functions.md) | Y | Y | Y | E | E | E | E | E | E | E | +| [Aggregation functions](/functions-and-operators/aggregate-group-by-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Window functions](/functions-and-operators/window-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Miscellaneous functions](/functions-and-operators/miscellaneous-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Operators](/functions-and-operators/operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Character sets and collations](/character-set-and-collation.md) [^1] | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [User-level lock](/functions-and-operators/locking-functions.md) | Y | Y | Y | Y | N | N | N | N | N | N | ## 
Indexing and constraints -| Indexing and constraints | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Expression indexes](/sql-statements/sql-statement-create-index.md#expression-index) [^2] | Y | Y | E | E | E | E | E | E | E | -| [Columnar storage (TiFlash)](/tiflash/tiflash-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Use FastScan to accelerate queries in OLAP scenarios](/tiflash/use-fastscan.md) | Y | E | N | N | N | N | N | N | N | -| [RocksDB engine](/storage-engine/rocksdb-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Titan plugin](/storage-engine/titan-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Titan Level Merge](/storage-engine/titan-configuration.md#level-merge-experimental) | E | E | E | E | E | E | E | E | E | -| [Use buckets to improve scan concurrency](/tune-region-performance.md#use-bucket-to-increase-concurrency) | E | E | E | N | N | N | N | N | N | -| [Invisible indexes](/sql-statements/sql-statement-add-index.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| [Composite `PRIMARY KEY`](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Unique indexes](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Clustered index on integer `PRIMARY KEY`](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Clustered index on composite or non-integer key](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| [Multi-valued index](/sql-statements/sql-statement-create-index.md#multi-valued-index) | Y | N | N | N | N | N | N | N | N | -| [Foreign key](/constraints.md#foreign-key) | Y | N | N | N | N | N | N | N | N | -| [TiFlash late materialization](/tiflash/tiflash-late-materialization.md) | Y | N | N | N | N | N | N | N | N | +| Indexing and constraints | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Expression 
indexes](/sql-statements/sql-statement-create-index.md#expression-index) [^2] | Y | Y | Y | E | E | E | E | E | E | E | +| [Columnar storage (TiFlash)](/tiflash/tiflash-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Use FastScan to accelerate queries in OLAP scenarios](/tiflash/use-fastscan.md) | Y | Y | E | N | N | N | N | N | N | N | +| [RocksDB engine](/storage-engine/rocksdb-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Titan plugin](/storage-engine/titan-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Titan Level Merge](/storage-engine/titan-configuration.md#level-merge-experimental) | E | E | E | E | E | E | E | E | E | E | +| [Use buckets to improve scan concurrency](/tune-region-performance.md#use-bucket-to-increase-concurrency) | E | E | E | E | N | N | N | N | N | N | +| [Invisible indexes](/sql-statements/sql-statement-add-index.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| [Composite `PRIMARY KEY`](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`CHECK` constraints](/constraints.md#check) | Y | N | N | N | N | N | N | N | N | N | +| [Unique indexes](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Clustered index on integer `PRIMARY KEY`](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Clustered index on composite or non-integer key](/constraints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| [Multi-valued index](/sql-statements/sql-statement-create-index.md#multi-valued-index) | Y | Y | N | N | N | N | N | N | N | N | +| [Foreign key](/constraints.md#foreign-key) | Y | Y | N | N | N | N | N | N | N | N | +| [TiFlash late materialization](/tiflash/tiflash-late-materialization.md) | Y | Y | N | N | N | N | N | N | N | N | ## SQL statements -| SQL statements [^3] | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| Basic `SELECT`, `INSERT`, `UPDATE`, `DELETE`, `REPLACE` | Y | Y | Y | Y | Y | Y | Y | Y | Y 
| -| `INSERT ON DUPLICATE KEY UPDATE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| `LOAD DATA INFILE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| `SELECT INTO OUTFILE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| `INNER JOIN`, LEFT\|RIGHT [OUTER] JOIN | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| `UNION`, `UNION ALL` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [`EXCEPT` and `INTERSECT` operators](/functions-and-operators/set-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| `GROUP BY`, `ORDER BY` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Window Functions](/functions-and-operators/window-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Common Table Expressions (CTE)](/sql-statements/sql-statement-with.md) | Y | Y | Y | Y | Y | Y | Y | N | N | -| `START TRANSACTION`, `COMMIT`, `ROLLBACK` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [`EXPLAIN`](/sql-statements/sql-statement-explain.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [`EXPLAIN ANALYZE`](/sql-statements/sql-statement-explain-analyze.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [User-defined variables](/user-defined-variables.md) | E | E | E | E | E | E | E | E | E | -| [`BATCH [ON COLUMN] LIMIT INTEGER DELETE`](/sql-statements/sql-statement-batch.md) | Y | Y | Y | N | N | N | N | N | N | -| [`BATCH [ON COLUMN] LIMIT INTEGER INSERT/UPDATE/REPLACE`](/sql-statements/sql-statement-batch.md) | Y | Y | N | N | N | N | N | N | N | -| [`ALTER TABLE ... 
COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) | Y | Y | E | N | N | N | N | N | N | -| [Table Lock](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md) | E | E | E | E | E | E | E | E | E | -| [TiFlash Query Result Materialization](/tiflash/tiflash-results-materialization.md) | Y | E | N | N | N | N | N | N | N | +| SQL statements [^3] | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| Basic `SELECT`, `INSERT`, `UPDATE`, `DELETE`, `REPLACE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| `INSERT ON DUPLICATE KEY UPDATE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| `LOAD DATA INFILE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| `SELECT INTO OUTFILE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| `INNER JOIN`, LEFT\|RIGHT [OUTER] JOIN | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| `UNION`, `UNION ALL` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`EXCEPT` and `INTERSECT` operators](/functions-and-operators/set-operators.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| `GROUP BY`, `ORDER BY` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Window Functions](/functions-and-operators/window-functions.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Common Table Expressions (CTE)](/sql-statements/sql-statement-with.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | N | +| `START TRANSACTION`, `COMMIT`, `ROLLBACK` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`EXPLAIN`](/sql-statements/sql-statement-explain.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`EXPLAIN ANALYZE`](/sql-statements/sql-statement-explain-analyze.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [User-defined variables](/user-defined-variables.md) | E | E | E | E | E | E | E | E | E | E | +| [`BATCH [ON COLUMN] LIMIT INTEGER DELETE`](/sql-statements/sql-statement-batch.md) | Y | Y | Y | Y | N | N | N | N | N | N | +| [`BATCH [ON COLUMN] LIMIT INTEGER 
INSERT/UPDATE/REPLACE`](/sql-statements/sql-statement-batch.md) | Y | Y | Y | N | N | N | N | N | N | N | +| [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md) | Y | Y | Y | E | N | N | N | N | N | N | +| [Table Lock](/sql-statements/sql-statement-lock-tables-and-unlock-tables.md) | E | E | E | E | E | E | E | E | E | E | +| [TiFlash Query Result Materialization](/tiflash/tiflash-results-materialization.md) | Y | Y | E | N | N | N | N | N | N | N | ## Advanced SQL features -| Advanced SQL features | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Prepared statement cache](/sql-prepared-plan-cache.md) | Y | Y | Y | Y | Y | E | E | E | E | -| [Non-prepared statement cache](/sql-non-prepared-plan-cache.md) | E | N | N | N | N | N | N | N | N | -| [SQL plan management (SPM)](/sql-plan-management.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Create bindings according to historical execution plans](/sql-plan-management.md#create-a-binding-according-to-a-historical-execution-plan) | Y | E | N | N | N | N | N | N | N | -| [Coprocessor cache](/coprocessor-cache.md) | Y | Y | Y | Y | Y | Y | Y | Y | E | -| [Stale Read](/stale-read.md) | Y | Y | Y | Y | Y | Y | Y | N | N | -| [Follower reads](/follower-read.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Read historical data (tidb_snapshot)](/read-historical-data.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Optimizer hints](/optimizer-hints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [MPP execution engine](/explain-mpp.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| [MPP execution engine - compression exchange](/explain-mpp.md#mpp-version-and-exchange-data-compression) | Y | N | N | N | N | N | N | N | N | -| [Index Merge](/explain-index-merge.md) | Y | Y | Y | Y | E | E | E | E | E | -| [Placement Rules in SQL](/placement-rules-in-sql.md) | Y | Y | Y | E | E | N | N | N | N | -| [Cascades 
Planner](/system-variables.md#tidb_enable_cascades_planner) | E | E | E | E | E | E | E | E | E | +| Advanced SQL features | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Prepared statement cache](/sql-prepared-plan-cache.md) | Y | Y | Y | Y | Y | Y | E | E | E | E | +| [Non-prepared statement cache](/sql-non-prepared-plan-cache.md) | E | E | N | N | N | N | N | N | N | N | +| [SQL plan management (SPM)](/sql-plan-management.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Create bindings according to historical execution plans](/sql-plan-management.md#create-a-binding-according-to-a-historical-execution-plan) | Y | Y | E | N | N | N | N | N | N | N | +| [Coprocessor cache](/coprocessor-cache.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | E | +| [Stale Read](/stale-read.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | N | +| [Follower reads](/follower-read.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Read historical data (tidb_snapshot)](/read-historical-data.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Optimizer hints](/optimizer-hints.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [MPP execution engine](/explain-mpp.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| [MPP execution engine - compression exchange](/explain-mpp.md#mpp-version-and-exchange-data-compression) | Y | Y | N | N | N | N | N | N | N | N | +| [TiFlash Pipeline Model](/tiflash/tiflash-pipeline-model.md) | E | N | N | N | N | N | N | N | N | N | +| [Index Merge](/explain-index-merge.md) | Y | Y | Y | Y | Y | E | E | E | E | E | +| [Placement Rules in SQL](/placement-rules-in-sql.md) | Y | Y | Y | Y | E | E | N | N | N | N | +| [Cascades Planner](/system-variables.md#tidb_enable_cascades_planner) | E | E | E | E | E | E | E | E | E | E | ## Data definition language (DDL) -| Data definition language (DDL) | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | 
-|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| Basic `CREATE`, `DROP`, `ALTER`, `RENAME`, `TRUNCATE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Generated columns](/generated-columns.md) | Y | E | E | E | E | E | E | E | E | -| [Views](/views.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Sequences](/sql-statements/sql-statement-create-sequence.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Auto increment](/auto-increment.md) | Y | Y[^4] | Y | Y | Y | Y | Y | Y | Y | -| [Auto random](/auto-random.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [TTL (Time to Live)](/time-to-live.md) | Y | E | N | N | N | N | N | N | N | -| [DDL algorithm assertions](/sql-statements/sql-statement-alter-table.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| Multi-schema change: add columns | Y | Y | E | E | E | E | E | E | E | -| [Change column type](/sql-statements/sql-statement-modify-column.md) | Y | Y | Y | Y | Y | Y | Y | N | N | -| [Temporary tables](/temporary-tables.md) | Y | Y | Y | Y | Y | N | N | N | N | -| Concurrent DDL statements | Y | Y | N | N | N | N | N | N | N | -| [Acceleration of `ADD INDEX` and `CREATE INDEX`](/system-variables.md#tidb_ddl_enable_fast_reorg-new-in-v630) | Y | Y | N | N | N | N | N | N | N | -| [Metadata lock](/metadata-lock.md) | Y | Y | N | N | N | N | N | N | N | -| [`FLASHBACK CLUSTER TO TIMESTAMP`](/sql-statements/sql-statement-flashback-to-timestamp.md) | Y | Y | N | N | N | N | N | N | N | +| Data definition language (DDL) | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| Basic `CREATE`, `DROP`, `ALTER`, `RENAME`, `TRUNCATE` | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Generated columns](/generated-columns.md) | Y | Y | E | E | E | E | E | E | E | E | +| [Views](/views.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Sequences](/sql-statements/sql-statement-create-sequence.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Auto 
increment](/auto-increment.md) | Y | Y | Y[^4] | Y | Y | Y | Y | Y | Y | Y | +| [Auto random](/auto-random.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [TTL (Time to Live)](/time-to-live.md) | Y | Y | E | N | N | N | N | N | N | N | +| [DDL algorithm assertions](/sql-statements/sql-statement-alter-table.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| Multi-schema change: add columns | Y | Y | Y | E | E | E | E | E | E | E | +| [Change column type](/sql-statements/sql-statement-modify-column.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | N | +| [Temporary tables](/temporary-tables.md) | Y | Y | Y | Y | Y | Y | N | N | N | N | +| Concurrent DDL statements | Y | Y | Y | N | N | N | N | N | N | N | +| [Acceleration of `ADD INDEX` and `CREATE INDEX`](/system-variables.md#tidb_ddl_enable_fast_reorg-new-in-v630) | Y | Y | Y | N | N | N | N | N | N | N | +| [Metadata lock](/metadata-lock.md) | Y | Y | Y | N | N | N | N | N | N | N | +| [`FLASHBACK CLUSTER TO TIMESTAMP`](/sql-statements/sql-statement-flashback-to-timestamp.md) | Y | Y | Y | N | N | N | N | N | N | N | +| [Pause](/sql-statements/sql-statement-admin-pause-ddl.md)/[Resume](/sql-statements/sql-statement-admin-resume-ddl.md) DDL | E | N | N | N | N | N | N | N | N | N | ## Transactions -| Transactions | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Async commit](/system-variables.md#tidb_enable_async_commit-new-in-v50) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| [1PC](/system-variables.md#tidb_enable_1pc-new-in-v50) | Y | Y | Y | Y | Y | Y | Y | Y | N | -| [Large transactions (10GB)](/transaction-overview.md#transaction-size-limit) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Pessimistic transactions](/pessimistic-transaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Optimistic transactions](/optimistic-transaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Repeatable-read isolation (snapshot isolation)](/transaction-isolation-levels.md) | 
Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Read-committed isolation](/transaction-isolation-levels.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| Transactions | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Async commit](/system-variables.md#tidb_enable_async_commit-new-in-v50) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| [1PC](/system-variables.md#tidb_enable_1pc-new-in-v50) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | +| [Large transactions (10GB)](/transaction-overview.md#transaction-size-limit) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Pessimistic transactions](/pessimistic-transaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Optimistic transactions](/optimistic-transaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Repeatable-read isolation (snapshot isolation)](/transaction-isolation-levels.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Read-committed isolation](/transaction-isolation-levels.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | ## Partitioning -| Partitioning | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Range partitioning](/partitioned-table.md#range-partitioning) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Hash partitioning](/partitioned-table.md#hash-partitioning) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Key partitioning](/partitioned-table.md#key-partitioning) | Y | N | N | N | N | N | N | N | N | -| [List partitioning](/partitioned-table.md#list-partitioning) | Y | Y | Y | E | E | E | E | E | N | -| [List COLUMNS partitioning](/partitioned-table.md) | Y | Y | Y | E | E | E | E | E | N | -| [`EXCHANGE PARTITION`](/partitioned-table.md) | Y | Y | E | E | E | E | E | E | N | -| [`REORGANIZE PARTITION`](/partitioned-table.md#reorganize-partitions) | Y | N | N | N | N | N | N | N | N | -| [`COALESCE PARTITION`](/partitioned-table.md#decrease-the-number-of-partitions) | Y | N | N 
| N | N | N | N | N | N | -| [Dynamic pruning](/partitioned-table.md#dynamic-pruning-mode) | Y | Y | Y | E | E | E | E | N | N | -| [Range COLUMNS partitioning](/partitioned-table.md#range-columns-partitioning) | Y | Y | N | N | N | N | N | N | N | -| [Range INTERVAL partitioning](/partitioned-table.md#range-interval-partitioning) | Y | E | N | N | N | N | N | N | N | +| Partitioning | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Range partitioning](/partitioned-table.md#range-partitioning) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Hash partitioning](/partitioned-table.md#hash-partitioning) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Key partitioning](/partitioned-table.md#key-partitioning) | Y | Y | N | N | N | N | N | N | N | N | +| [List partitioning](/partitioned-table.md#list-partitioning) | Y | Y | Y | Y | E | E | E | E | E | N | +| [List COLUMNS partitioning](/partitioned-table.md) | Y | Y | Y | Y | E | E | E | E | E | N | +| [`EXCHANGE PARTITION`](/partitioned-table.md) | Y | Y | Y | E | E | E | E | E | E | N | +| [`REORGANIZE PARTITION`](/partitioned-table.md#reorganize-partitions) | Y | Y | N | N | N | N | N | N | N | N | +| [`COALESCE PARTITION`](/partitioned-table.md#decrease-the-number-of-partitions) | Y | Y | N | N | N | N | N | N | N | N | +| [Dynamic pruning](/partitioned-table.md#dynamic-pruning-mode) | Y | Y | Y | Y | E | E | E | E | N | N | +| [Range COLUMNS partitioning](/partitioned-table.md#range-columns-partitioning) | Y | Y | Y | N | N | N | N | N | N | N | +| [Range INTERVAL partitioning](/partitioned-table.md#range-interval-partitioning) | Y | Y | E | N | N | N | N | N | N | N | ## Statistics -| Statistics | 7.1 | 6.5 | 6.1 | 6.0 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [CMSketch](/statistics.md) | Disabled by default | Disabled by default | Disabled by default | Disabled by default | Disabled 
by default | Disabled by default | Y | Y | Y | -| [Histograms](/statistics.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Extended statistics](/extended-statistics.md) | E | E | E | E | E | E | E | E | E | -| Statistics feedback | N | N | Deprecated | Deprecated | Deprecated | E | E | E | E | -| [Automatically update statistics](/statistics.md#automatic-update) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Fast Analyze](/system-variables.md#tidb_enable_fast_analyze) | E | E | E | E | E | E | E | E | E | -| [Dynamic pruning](/partitioned-table.md#dynamic-pruning-mode) | Y | Y | Y | E | E | E | E | E | N | -| [Collect statistics for `PREDICATE COLUMNS`](/statistics.md#collect-statistics-on-some-columns) | E | E | E | E | E | N | N | N | N | -| [Control the memory quota for collecting statistics](/statistics.md#the-memory-quota-for-collecting-statistics) | E | E | E | N | N | N | N | N | N | -| [Randomly sample about 10000 rows of data to quickly build statistics](/system-variables.md#tidb_enable_fast_analyze) | E | E | E | E | E | E | E | E | E | -| [Lock statistics](/statistics.md#lock-statistics) | E | E | N | N | N | N | N | N | N | -| [Lightweight statistics initialization](/statistics.md#load-statistics) | E | N | N | N | N | N | N | N | N | +| Statistics | 7.2 | 7.1 | 6.5 | 6.1 | 6.0 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [CMSketch](/statistics.md) | Disabled by default | Disabled by default | Disabled by default | Disabled by default | Disabled by default | Disabled by default | Disabled by default | Y | Y | Y | +| [Histograms](/statistics.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Extended statistics](/extended-statistics.md) | E | E | E | E | E | E | E | E | E | E | +| Statistics feedback | N | N | N | Deprecated | Deprecated | Deprecated | E | E | E | E | +| [Automatically update statistics](/statistics.md#automatic-update) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Fast 
Analyze](/system-variables.md#tidb_enable_fast_analyze) | E | E | E | E | E | E | E | E | E | E | +| [Dynamic pruning](/partitioned-table.md#dynamic-pruning-mode) | Y | Y | Y | Y | E | E | E | E | E | N | +| [Collect statistics for `PREDICATE COLUMNS`](/statistics.md#collect-statistics-on-some-columns) | E | E | E | E | E | E | N | N | N | N | +| [Control the memory quota for collecting statistics](/statistics.md#the-memory-quota-for-collecting-statistics) | E | E | E | E | N | N | N | N | N | N | +| [Randomly sample about 10000 rows of data to quickly build statistics](/system-variables.md#tidb_enable_fast_analyze) | E | E | E | E | E | E | E | E | E | E | +| [Lock statistics](/statistics.md#lock-statistics) | E | E | E | N | N | N | N | N | N | N | +| [Lightweight statistics initialization](/statistics.md#load-statistics) | Y | E | N | N | N | N | N | N | N | N | ## Security -| Security | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Transparent layer security (TLS)](/enable-tls-between-clients-and-servers.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Encryption at rest (TDE)](/encryption-at-rest.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Role-based authentication (RBAC)](/role-based-access-control.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Certificate-based authentication](/certificate-authentication.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [`caching_sha2_password` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | Y | Y | Y | Y | N | N | N | -| [`tidb_sm3_password` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | N | N | N | N | N | N | N | -| [`tidb_auth_token` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | N | N | N | N | N | N | N | -| [`authentication_ldap_sasl` authentication](/system-variables.md#default_authentication_plugin) | Y | N | N | N | N | N | N | N | N | -| 
[`authentication_ldap_simple` authentication](/system-variables.md#default_authentication_plugin) | Y | N | N | N | N | N | N | N | N | -| [Password management](/password-management.md) | Y | Y | N | N | N | N | N | N | N | -| [MySQL compatible `GRANT` system](/privilege-management.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Dynamic Privileges](/privilege-management.md#dynamic-privileges) | Y | Y | Y | Y | Y | Y | Y | N | N | -| [Security Enhanced Mode](/system-variables.md#tidb_enable_enhanced_security) | Y | Y | Y | Y | Y | Y | Y | N | N | -| [Redacted Log Files](/log-redaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | N | +| Security | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Transparent layer security (TLS)](/enable-tls-between-clients-and-servers.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Encryption at rest (TDE)](/encryption-at-rest.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Role-based authentication (RBAC)](/role-based-access-control.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Certificate-based authentication](/certificate-authentication.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`caching_sha2_password` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | Y | Y | Y | Y | Y | N | N | N | +| [`tidb_sm3_password` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | Y | N | N | N | N | N | N | N | +| [`tidb_auth_token` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | Y | N | N | N | N | N | N | N | +| [`authentication_ldap_sasl` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | N | N | N | N | N | N | N | N | +| [`authentication_ldap_simple` authentication](/system-variables.md#default_authentication_plugin) | Y | Y | N | N | N | N | N | N | N | N | +| [Password management](/password-management.md) | Y | Y | Y | N | N | N | N | N | N | N | +| 
[MySQL compatible `GRANT` system](/privilege-management.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Dynamic Privileges](/privilege-management.md#dynamic-privileges) | Y | Y | Y | Y | Y | Y | Y | Y | N | N | +| [Security Enhanced Mode](/system-variables.md#tidb_enable_enhanced_security) | Y | Y | Y | Y | Y | Y | Y | Y | N | N | +| [Redacted Log Files](/log-redaction.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | ## Data import and export -| Data import and export | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [Fast Importer (TiDB Lightning)](/tidb-lightning/tidb-lightning-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| mydumper logical dumper | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | -| [Dumpling logical dumper](/dumpling-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Transactional `LOAD DATA`](/sql-statements/sql-statement-load-data.md) | Y [^5] | Y | Y | Y | Y | Y | Y | Y | N [^6] | -| [Database migration toolkit (DM)](/migration-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Change data capture (CDC)](/ticdc/ticdc-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Stream data to Amazon S3, GCS, Azure Blob Storage, and NFS through TiCDC](/ticdc/ticdc-sink-to-cloud-storage.md) | Y | E | N | N | N | N | N | N | N | -| [TiCDC supports bidirectional replication between two TiDB clusters](/ticdc/ticdc-bidirectional-replication.md) | Y | Y | N | N | N | N | N | N | N | -| [TiCDC OpenAPI v2](/ticdc/ticdc-open-api-v2.md) | Y | N | N | N | N | N | N | N | N | +| Data import and export | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [Fast import using TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) | Y | Y | Y 
| Y | Y | Y | Y | Y | Y | Y | +| [Fast import using the `IMPORT INTO` statement](/sql-statements/sql-statement-import-into.md) | E | N | N | N | N | N | N | N | N | N | +| mydumper logical dumper | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | Deprecated | +| [Dumpling logical dumper](/dumpling-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Transactional `LOAD DATA`](/sql-statements/sql-statement-load-data.md) [^5] | Y | Y | Y | Y | Y | Y | Y | Y | Y | N [^6] | +| [Database migration toolkit (DM)](/migration-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Change data capture (CDC)](/ticdc/ticdc-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Stream data to Amazon S3, GCS, Azure Blob Storage, and NFS through TiCDC](/ticdc/ticdc-sink-to-cloud-storage.md) | Y | Y | E | N | N | N | N | N | N | N | +| [TiCDC supports bidirectional replication between two TiDB clusters](/ticdc/ticdc-bidirectional-replication.md) | Y | Y | Y | N | N | N | N | N | N | N | +| [TiCDC OpenAPI v2](/ticdc/ticdc-open-api-v2.md) | Y | Y | N | N | N | N | N | N | N | N | ## Management, observability, and tools -| Management, observability, and tools | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | -|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| -| [TiDB Dashboard UI](/dashboard/dashboard-intro.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [TiDB Dashboard Continuous Profiling](/dashboard/continuous-profiling.md) | Y | Y | Y | E | E | N | N | N | N | -| [TiDB Dashboard Top SQL](/dashboard/top-sql.md) | Y | Y | Y | E | N | N | N | N | N | -| [TiDB Dashboard SQL Diagnostics](/information-schema/information-schema-sql-diagnostics.md) | Y | Y | E | E | E | E | E | E | E | -| [TiDB Dashboard Cluster Diagnostics](/dashboard/dashboard-diagnostics-access.md) | Y | Y | E | E | E | E | E | E 
| E | -| [TiKV-FastTune dashboard](/grafana-tikv-dashboard.md#tikv-fasttune-dashboard) | E | E | E | E | E | E | E | E | E | -| [Information schema](/information-schema/information-schema.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Metrics schema](/metrics-schema.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Statements summary tables](/statement-summary-tables.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Statements summary tables - summary persistence](/statement-summary-tables.md#persist-statements-summary) | E | N | N | N | N | N | N | N | N | -| [Slow query log](/identify-slow-queries.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [TiUP deployment](/tiup/tiup-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Kubernetes operator](https://docs.pingcap.com/tidb-in-kubernetes/) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Built-in physical backup](/br/backup-and-restore-use-cases.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [Global Kill](/sql-statements/sql-statement-kill.md) | Y | Y | Y | E | E | E | E | E | E | -| [Lock View](/information-schema/information-schema-data-lock-waits.md) | Y | Y | Y | Y | Y | Y | E | E | E | -| [`SHOW CONFIG`](/sql-statements/sql-statement-show-config.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [`SET CONFIG`](/dynamic-config.md) | Y | Y | Y | E | E | E | E | E | E | -| [DM WebUI](/dm/dm-webui-guide.md) | E | E | E | N | N | N | N | N | N | -| [Foreground Quota Limiter](/tikv-configuration-file.md#foreground-quota-limiter) | Y | Y | E | N | N | N | N | N | N | -| [Background Quota Limiter](/tikv-configuration-file.md#background-quota-limiter) | E | E | N | N | N | N | N | N | N | -| [EBS volume snapshot backup and restore](https://docs.pingcap.com/tidb-in-kubernetes/v1.4/backup-to-aws-s3-by-snapshot) | Y | Y | N | N | N | N | N | N | N | -| [PITR](/br/backup-and-restore-overview.md) | Y | Y | N | N | N | N | N | N | N | -| [Global memory 
control](/configure-memory-usage.md#configure-the-memory-usage-threshold-of-a-tidb-server-instance) | Y | Y | N | N | N | N | N | N | N | -| [Cross-cluster RawKV replication](/tikv-configuration-file.md#api-version-new-in-v610) | E | E | N | N | N | N | N | N | N | -| [Green GC](/system-variables.md#tidb_gc_scan_lock_mode-new-in-v50) | E | E | E | E | E | E | E | E | N | -| [Resource control](/tidb-resource-control.md) | Y | N | N | N | N | N | N | N | N | -| [TiFlash Disaggregated Storage and Compute Architecture and S3 Support](/tiflash/tiflash-disaggregated-and-s3.md) | E | N | N | N | N | N | N | N | N | +| Management, observability, and tools | 7.2 | 7.1 | 6.5 | 6.1 | 5.4 | 5.3 | 5.2 | 5.1 | 5.0 | 4.0 | +|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| +| [TiDB Dashboard UI](/dashboard/dashboard-intro.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [TiDB Dashboard Continuous Profiling](/dashboard/continuous-profiling.md) | Y | Y | Y | Y | E | E | N | N | N | N | +| [TiDB Dashboard Top SQL](/dashboard/top-sql.md) | Y | Y | Y | Y | E | N | N | N | N | N | +| [TiDB Dashboard SQL Diagnostics](/information-schema/information-schema-sql-diagnostics.md) | Y | Y | Y | E | E | E | E | E | E | E | +| [TiDB Dashboard Cluster Diagnostics](/dashboard/dashboard-diagnostics-access.md) | Y | Y | Y | E | E | E | E | E | E | E | +| [TiKV-FastTune dashboard](/grafana-tikv-dashboard.md#tikv-fasttune-dashboard) | E | E | E | E | E | E | E | E | E | E | +| [Information schema](/information-schema/information-schema.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Metrics schema](/metrics-schema.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Statements summary tables](/statement-summary-tables.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Statements summary tables - summary persistence](/statement-summary-tables.md#persist-statements-summary) | E | E | N | N | N | N | N | N | N | N | +| [Slow query log](/identify-slow-queries.md) | Y | Y | Y | Y | Y | Y | Y 
| Y | Y | Y | +| [TiUP deployment](/tiup/tiup-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Kubernetes operator](https://docs.pingcap.com/tidb-in-kubernetes/) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Built-in physical backup](/br/backup-and-restore-use-cases.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [Global Kill](/sql-statements/sql-statement-kill.md) | Y | Y | Y | Y | E | E | E | E | E | E | +| [Lock View](/information-schema/information-schema-data-lock-waits.md) | Y | Y | Y | Y | Y | Y | Y | E | E | E | +| [`SHOW CONFIG`](/sql-statements/sql-statement-show-config.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [`SET CONFIG`](/dynamic-config.md) | Y | Y | Y | Y | E | E | E | E | E | E | +| [DM WebUI](/dm/dm-webui-guide.md) | E | E | E | E | N | N | N | N | N | N | +| [Foreground Quota Limiter](/tikv-configuration-file.md#foreground-quota-limiter) | Y | Y | Y | E | N | N | N | N | N | N | +| [Background Quota Limiter](/tikv-configuration-file.md#background-quota-limiter) | E | E | E | N | N | N | N | N | N | N | +| [EBS volume snapshot backup and restore](https://docs.pingcap.com/tidb-in-kubernetes/v1.4/backup-to-aws-s3-by-snapshot) | Y | Y | Y | N | N | N | N | N | N | N | +| [PITR](/br/backup-and-restore-overview.md) | Y | Y | Y | N | N | N | N | N | N | N | +| [Global memory control](/configure-memory-usage.md#configure-the-memory-usage-threshold-of-a-tidb-server-instance) | Y | Y | Y | N | N | N | N | N | N | N | +| [Cross-cluster RawKV replication](/tikv-configuration-file.md#api-version-new-in-v610) | E | E | E | N | N | N | N | N | N | N | +| [Green GC](/system-variables.md#tidb_gc_scan_lock_mode-new-in-v50) | E | E | E | E | E | E | E | E | E | N | +| [Resource control](/tidb-resource-control.md) | Y | Y | N | N | N | N | N | N | N | N | +| [Runaway Queries Management](/tidb-resource-control.md#manage-queries-that-consume-more-resources-than-expected-runaway-queries) | E | N | N | N | N | N | N | N | N | N | +| [TiFlash 
Disaggregated Storage and Compute Architecture and S3 Support](/tiflash/tiflash-disaggregated-and-s3.md) | E | E | N | N | N | N | N | N | N | N | [^1]: TiDB incorrectly treats latin1 as a subset of utf8. See [TiDB #18955](https://github.com/pingcap/tidb/issues/18955) for more details. @@ -247,6 +252,6 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u [^4]: Starting from [v6.4.0](/releases/release-6.4.0.md), TiDB supports [high-performance and globally monotonic `AUTO_INCREMENT` columns](/auto-increment.md#mysql-compatibility-mode) -[^5]: For [TiDB v7.0.0](/releases/release-7.0.0.md), the new parameter `FIELDS DEFINED NULL BY` and support for importing data from S3 and GCS are experimental features. +[^5]: Starting from [TiDB v7.0.0](/releases/release-7.0.0.md), the new parameter `FIELDS DEFINED NULL BY` and support for importing data from S3 and GCS are experimental features. [^6]: For TiDB v4.0, the `LOAD DATA` transaction does not guarantee atomicity. 
From 769ea6da9cdf7db553b7ce00d919d28ecb025f05 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Thu, 29 Jun 2023 14:09:13 +0800 Subject: [PATCH 14/30] Add v7.2.0 release notes (#13910) --- TOC.md | 4 +- releases/release-7.2.0.md | 328 +++++++++++++++++++++++++++++++++++ releases/release-notes.md | 4 + releases/release-timeline.md | 1 + upgrade-tidb-using-tiup.md | 2 +- 5 files changed, 337 insertions(+), 2 deletions(-) create mode 100644 releases/release-7.2.0.md diff --git a/TOC.md b/TOC.md index 38551e75bd240..845109f7fdeba 100644 --- a/TOC.md +++ b/TOC.md @@ -4,7 +4,7 @@ - [Docs Home](https://docs.pingcap.com/) - About TiDB - [TiDB Introduction](/overview.md) - - [TiDB 7.1 Release Notes](/releases/release-7.1.0.md) + - [TiDB 7.2 Release Notes](/releases/release-7.2.0.md) - [Features](/basic-features.md) - [MySQL Compatibility](/mysql-compatibility.md) - [TiDB Limitations](/tidb-limitations.md) @@ -985,6 +985,8 @@ - [Release Timeline](/releases/release-timeline.md) - [TiDB Versioning](/releases/versioning.md) - [TiDB Installation Packages](/binary-package.md) + - v7.2 + - [7.2.0-DMR](/releases/release-7.2.0.md) - v7.1 - [7.1.0](/releases/release-7.1.0.md) - v7.0 diff --git a/releases/release-7.2.0.md b/releases/release-7.2.0.md new file mode 100644 index 0000000000000..15d25c97f51df --- /dev/null +++ b/releases/release-7.2.0.md @@ -0,0 +1,328 @@ +--- +title: TiDB 7.2.0 Release Notes +summary: Learn about the new features, compatibility changes, improvements, and bug fixes in TiDB 7.2.0. +--- + +# TiDB 7.2.0 Release Notes + +Release date: June 29, 2023 + +TiDB version: 7.2.0 + +Quick access: [Quick start](https://docs.pingcap.com/tidb/v7.2/quick-start-with-tidb) | [Installation packages](https://www.pingcap.com/download/?version=v7.2.0#version-list) + +7.2.0 introduces the following key features and improvements: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Category | Feature | Description |
+|---|---|---|
+| Scalability and Performance | Resource groups support managing runaway queries (experimental) | You can now manage query timeout with more granularity, allowing for different behaviors based on query classifications. Queries meeting your specified threshold can be deprioritized or terminated. |
+| | TiFlash supports the pipeline execution model (experimental) | TiFlash supports a pipeline execution model to optimize thread resource control. |
+| SQL | Support a new SQL statement, `IMPORT INTO`, to enable data import using the TiDB service itself (experimental) | To simplify the deployment and maintenance of TiDB Lightning, TiDB introduces a new SQL statement `IMPORT INTO`, which integrates the physical import mode of TiDB Lightning, including remote import from Amazon S3 or Google Cloud Storage (GCS) directly into TiDB. |
+| DB Operations and Observability | DDL supports pause and resume operations (experimental) | This new capability lets you temporarily suspend resource-intensive DDL operations, such as index creation, to conserve resources and minimize the impact on online traffic. You can seamlessly resume these operations when ready, without the need to cancel and restart. This feature enhances resource utilization, improves user experience, and streamlines schema changes. |
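
As an illustration of the pause and resume row above, the following is a minimal sketch of pausing a DDL job and resuming it later. It assumes a hypothetical table `t` with a column `c`, and that `ADMIN SHOW DDL JOBS` reports the running job's ID as `101`. The table, index name, and job ID are placeholders, not values taken from these release notes. Because the feature is experimental, treat the exact behavior as version-dependent.

```sql
-- Session 1: start a long-running DDL job (hypothetical table and index names).
ALTER TABLE t ADD INDEX idx_c (c);

-- Session 2: look up the ID of the running DDL job.
ADMIN SHOW DDL JOBS;

-- Pause the job. 101 is a placeholder for the JOB_ID shown by the previous statement.
ADMIN PAUSE DDL JOBS 101;

-- Resume the paused job when the cluster has spare capacity.
ADMIN RESUME DDL JOBS 101;
```

Pausing instead of canceling keeps the job in the DDL queue, so the schema change does not have to be submitted again.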
+
+## Feature details
+
+### Performance
+
+* Support pushing down the following two [window functions](/tiflash/tiflash-supported-pushdown-calculations.md) to TiFlash [#7427](https://github.com/pingcap/tiflash/issues/7427) @[xzhangxian1008](https://github.com/xzhangxian1008)
+
+    * `FIRST_VALUE`
+    * `LAST_VALUE`
+
+* TiFlash supports the pipeline execution model (experimental) [#6518](https://github.com/pingcap/tiflash/issues/6518) @[SeaRise](https://github.com/SeaRise)
+
+    Before v7.2.0, each task in the TiFlash engine had to request thread resources individually during execution. TiFlash controlled the number of tasks to limit thread resource usage and prevent overuse, but overuse could not be completely eliminated. To address this problem, starting from v7.2.0, TiFlash introduces a pipeline execution model. This model centrally manages all thread resources and schedules task execution uniformly, maximizing the utilization of thread resources while avoiding resource overuse. To enable or disable the pipeline execution model, modify the [`tidb_enable_tiflash_pipeline_model`](/system-variables.md#tidb_enable_tiflash_pipeline_model-new-in-v720) system variable.
+
+    For more information, see [documentation](/tiflash/tiflash-pipeline-model.md).
+
+* TiFlash reduces the latency of schema replication [#7630](https://github.com/pingcap/tiflash/issues/7630) @[hongyunyan](https://github.com/hongyunyan)
+
+    When the schema of a table changes, TiFlash needs to replicate the latest schema from TiKV in a timely manner. Before v7.2.0, when TiFlash accesses table data and detects a table schema change within a database, TiFlash needs to replicate the schemas of all tables in this database again, including those tables without TiFlash replicas. As a result, in a database with a large number of tables, even if you only need to read data from a single table using TiFlash, you might experience significant latency while waiting for TiFlash to complete the schema replication of all tables.
+
+    In v7.2.0, TiFlash optimizes the schema replication mechanism and replicates only the schemas of tables that have TiFlash replicas. When a schema change is detected for a table with TiFlash replicas, TiFlash only replicates the schema of that table, which reduces the latency of TiFlash schema replication and minimizes the impact of DDL operations on TiFlash data replication. This optimization is automatically applied and does not require any manual configuration.
+
+* Improve the performance of statistics collection [#44725](https://github.com/pingcap/tidb/issues/44725) @[xuyifangreeneyes](https://github.com/xuyifangreeneyes)
+
+    TiDB v7.2.0 optimizes the statistics collection strategy, skipping some of the duplicate information and information that is of little value to the optimizer. The overall speed of statistics collection has been improved by 30%. This improvement allows TiDB to update the statistics of the database in a more timely manner, making the generated execution plans more accurate, thus improving the overall database performance.
+
+    By default, statistics collection skips the columns of the `JSON`, `BLOB`, `MEDIUMBLOB`, and `LONGBLOB` types. You can modify the default behavior by setting the [`tidb_analyze_skip_column_types`](/system-variables.md#tidb_analyze_skip_column_types-new-in-v720) system variable. TiDB supports skipping the `JSON`, `BLOB`, and `TEXT` types and their subtypes.
+
+    For more information, see [documentation](/system-variables.md#tidb_analyze_skip_column_types-new-in-v720).
+
+* Improve the performance of checking data and index consistency [#43693](https://github.com/pingcap/tidb/issues/43693) @[wjhuang2016](https://github.com/wjhuang2016)
+
+    The [`ADMIN CHECK [TABLE|INDEX]`](/sql-statements/sql-statement-admin-check-table-index.md) statement is used to check the consistency between data in a table and its corresponding indexes. In v7.2.0, TiDB optimizes the method for checking data consistency and greatly improves the execution efficiency of [`ADMIN CHECK [TABLE|INDEX]`](/sql-statements/sql-statement-admin-check-table-index.md). In scenarios with large amounts of data, this optimization can improve performance by hundreds of times.
+
+    The optimization is enabled by default ([`tidb_enable_fast_table_check`](/system-variables.md#tidb_enable_fast_table_check-new-in-v720) is `ON` by default) to greatly reduce the time required for data consistency checks in large-scale tables and enhance operational efficiency.
+
+    For more information, see [documentation](/system-variables.md#tidb_enable_fast_table_check-new-in-v720).
+
+### Reliability
+
+* Automatically manage queries that consume more resources than expected (experimental) [#43691](https://github.com/pingcap/tidb/issues/43691) @[Connor1996](https://github.com/Connor1996) @[CabinfeverB](https://github.com/CabinfeverB) @[glorv](https://github.com/glorv) @[HuSharp](https://github.com/HuSharp) @[nolouch](https://github.com/nolouch)
+
+    The most common challenge to database stability is the degradation of overall database performance caused by abrupt SQL performance problems. There are many causes for SQL performance issues, such as new SQL statements that have not been fully tested, drastic changes in data volume, and abrupt changes in execution plans. These issues are difficult to eliminate at the root. TiDB v7.2.0 provides the ability to manage queries that consume more resources than expected. This feature can quickly reduce the scope of impact when a performance issue occurs.
+
+    To manage these queries, you can set the maximum execution time of queries for a resource group. When the execution time of a query exceeds this limit, the query is automatically deprioritized or cancelled. You can also set a period of time to immediately match identified queries by text or execution plan. This helps prevent the problematic queries from running at high concurrency during the identification phase and consuming more resources than expected.
+
+    Automatic management of queries that consume more resources than expected provides you with an effective means to quickly respond to unexpected query performance problems. This feature can reduce the impact of the problem on overall database performance, thereby improving database stability.
+
+    For more information, see [documentation](/tidb-resource-control.md#manage-queries-that-consume-more-resources-than-expected-runaway-queries).
+
+* Enhance the capability of creating a binding according to a historical execution plan [#39199](https://github.com/pingcap/tidb/issues/39199) @[qw4990](https://github.com/qw4990)
+
+    TiDB v7.2.0 enhances the capability of [creating a binding according to a historical execution plan](/sql-plan-management.md#create-a-binding-according-to-a-historical-execution-plan). This feature improves the parsing and binding process for complex statements, making the bindings more stable, and supports the following new hints:
+
+    - [`AGG_TO_COP()`](/optimizer-hints.md#agg_to_cop)
+    - [`LIMIT_TO_COP()`](/optimizer-hints.md#limit_to_cop)
+    - [`ORDER_INDEX()`](/optimizer-hints.md#order_indext1_name-idx1_name--idx2_name-)
+    - [`NO_ORDER_INDEX()`](/optimizer-hints.md#no_order_indext1_name-idx1_name--idx2_name-)
+
+    For more information, see [documentation](/sql-plan-management.md).
+
+* Introduce the Optimizer Fix Controls mechanism to provide fine-grained control over optimizer behaviors [#43169](https://github.com/pingcap/tidb/issues/43169) @[time-and-fate](https://github.com/time-and-fate)
+
+    To generate more reasonable execution plans, the behavior of the TiDB optimizer evolves over product iterations. However, in some particular scenarios, the changes might lead to performance regression.
TiDB v7.2.0 introduces Optimizer Fix Controls to let you control some of the fine-grained behaviors of the optimizer. This enables you to roll back or control some new changes. + + Each controllable behavior is described by a GitHub issue corresponding to the fix number. All controllable behaviors are listed in [Optimizer Fix Controls](/optimizer-fix-controls.md). You can set a target value for one or more behaviors by setting the [`tidb_opt_fix_control`](/system-variables.md#tidb_opt_fix_control-new-in-v710) system variable to achieve behavior control. + + The Optimizer Fix Controls mechanism helps you control the TiDB optimizer at a granular level. It provides a new means of fixing performance issues caused by the upgrade process and improves the stability of TiDB. + + For more information, see [documentation](/optimizer-fix-controls.md). + +* Lightweight statistics initialization becomes generally available (GA) [#42160](https://github.com/pingcap/tidb/issues/42160) @[xuyifangreeneyes](https://github.com/xuyifangreeneyes) + + Starting from v7.2.0, the lightweight statistics initialization feature becomes GA. Lightweight statistics initialization can significantly reduce the number of statistics that must be loaded during startup, thus improving the speed of loading statistics. This feature increases the stability of TiDB in complex runtime environments and reduces the impact on the overall service when TiDB nodes restart. + + For newly created clusters of v7.2.0 or later versions, TiDB loads lightweight statistics by default during TiDB startup and will wait for the loading to finish before providing services. For clusters upgraded from earlier versions, you can set the TiDB configuration items [`lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710) and [`force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v710) to `true` to enable this feature. + + For more information, see [documentation](/statistics.md#load-statistics). 
+ +### SQL + +* Support the `CHECK` constraints [#41711](https://github.com/pingcap/tidb/issues/41711) @[fzzf678](https://github.com/fzzf678) + + Starting from v7.2.0, you can use `CHECK` constraints to restrict the values of one or more columns in a table to meet your specified conditions. When a `CHECK` constraint is added to a table, TiDB checks whether the constraint is satisfied before inserting or updating data in the table. Only the data that satisfies the constraint can be written. + + This feature is disabled by default. You can set the [`tidb_enable_check_constraint`](/system-variables.md#tidb_enable_check_constraint-new-in-v720) system variable to `ON` to enable it. + + For more information, see [documentation](/constraints.md#check). + +### DB operations + +* DDL jobs support pause and resume operations (experimental) [#18015](https://github.com/pingcap/tidb/issues/18015) @[godouxm](https://github.com/godouxm) + + Before TiDB v7.2.0, when a DDL job encounters a business peak during execution, you can only manually cancel the DDL job to reduce its impact on the business. In v7.2.0, TiDB introduces pause and resume operations for DDL jobs. These operations let you pause DDL jobs during a peak and resume them after the peak ends, thus avoiding impact on your application workloads. + + For example, you can pause and resume multiple DDL jobs using `ADMIN PAUSE DDL JOBS` or `ADMIN RESUME DDL JOBS`: + + ```sql + ADMIN PAUSE DDL JOBS 1,2; + ADMIN RESUME DDL JOBS 1,2; + ``` + + For more information, see [documentation](/ddl-introduction.md#ddl-related-commands). + +### Data migration + +* Introduce a new SQL statement `IMPORT INTO` to improve data import efficiency greatly (experimental) [#42930](https://github.com/pingcap/tidb/issues/42930) @[D3Hunter](https://github.com/D3Hunter) + + The `IMPORT INTO` statement integrates the [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) capability of TiDB Lightning. 
With this statement, you can quickly import data in formats such as CSV, SQL, and PARQUET into an empty table in TiDB. This import method eliminates the need to deploy and manage TiDB Lightning separately, thereby reducing the complexity of data import and greatly improving import efficiency. + + For data files stored in Amazon S3 or GCS, when the [Backend task distributed execution framework](/tidb-distributed-execution-framework.md) is enabled, `IMPORT INTO` also supports splitting a data import job into multiple sub-jobs and scheduling them to multiple TiDB nodes for parallel import, which further enhances import performance. + + For more information, see [documentation](/sql-statements/sql-statement-import-into.md). + +* TiDB Lightning supports importing source files with the Latin-1 character set into TiDB [#44434](https://github.com/pingcap/tidb/issues/44434) @[lance6716](https://github.com/lance6716) + + With this feature, you can directly import source files with the Latin-1 character set into TiDB using TiDB Lightning. Before v7.2.0, importing such files required additional preprocessing or conversion. Starting from v7.2.0, you only need to specify `character-set = "latin1"` when configuring the TiDB Lightning import task. Then, TiDB Lightning automatically handles the character set conversion during the import process to ensure data integrity and accuracy. + + For more information, see [documentation](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task). + +## Compatibility changes + +> **Note:** +> +> This section provides compatibility changes you need to know when you upgrade from v7.1.0 to the current version (v7.2.0). If you are upgrading from v7.0.0 or earlier versions to the current version, you might also need to check the compatibility changes introduced in intermediate versions.
+ +### System variables + +| Variable name | Change type | Description | +|--------|------------------------------|------| +| [`last_insert_id`](/system-variables.md#last_insert_id) | Modified | Changes the maximum value from `9223372036854775807` to `18446744073709551615` to be consistent with that of MySQL. | +| [`tidb_enable_non_prepared_plan_cache`](/system-variables.md#tidb_enable_non_prepared_plan_cache) | Modified | Changes the default value from `OFF` to `ON` after further tests, meaning that non-prepared execution plan cache is enabled. | +| [`tidb_remove_orderby_in_subquery`](/system-variables.md#tidb_remove_orderby_in_subquery-new-in-v610) | Modified | Changes the default value from `OFF` to `ON` after further tests, meaning that the optimizer removes the `ORDER BY` clause in a subquery. | +| [`tidb_analyze_skip_column_types`](/system-variables.md#tidb_analyze_skip_column_types-new-in-v720) | Newly added | Controls which types of columns are skipped for statistics collection when executing the `ANALYZE` command to collect statistics. The variable is only applicable for [`tidb_analyze_version = 2`](/system-variables.md#tidb_analyze_version-new-in-v510). When using the syntax of `ANALYZE TABLE t COLUMNS c1, ..., cn`, if the type of a specified column is included in `tidb_analyze_skip_column_types`, the statistics of this column will not be collected. | +| [`tidb_enable_check_constraint`](/system-variables.md#tidb_enable_check_constraint-new-in-v720) | Newly added | Controls whether to enable `CHECK` constraints. The default value is `OFF`, which means this feature is disabled. | +| [`tidb_enable_fast_table_check`](/system-variables.md#tidb_enable_fast_table_check-new-in-v720) | Newly added | Controls whether to use a checksum-based approach to quickly check the consistency of data and indexes in a table. The default value is `ON`, which means this feature is enabled. 
| +| [`tidb_enable_tiflash_pipeline_model`](/system-variables.md#tidb_enable_tiflash_pipeline_model-new-in-v720) | Newly added | Controls whether to enable the new execution model of TiFlash, the [pipeline model](/tiflash/tiflash-pipeline-model.md). The default value is `OFF`, which means the pipeline model is disabled. | +| [`tidb_expensive_txn_time_threshold`](/system-variables.md#tidb_expensive_txn_time_threshold-new-in-v720) | Newly added | Controls the threshold for logging expensive transactions, which is 600 seconds by default. When the duration of a transaction exceeds the threshold, and the transaction is neither committed nor rolled back, it is considered an expensive transaction and will be logged. | + +### Configuration file parameters + +| Configuration file | Configuration parameter | Change type | Description | +| -------- | -------- | -------- | -------- | +| TiDB | [`lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710) | Modified | Changes the default value from `false` to `true` after further tests, meaning that TiDB uses lightweight statistics initialization by default during TiDB startup to improve the initialization efficiency. | +| TiDB | [`force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v710) | Modified | Changes the default value from `false` to `true` to align with [`lite-init-stats`](/tidb-configuration-file.md#lite-init-stats-new-in-v710), meaning that TiDB waits for statistics initialization to finish before providing services during TiDB startup. | +| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].compaction-guard-min-output-file-size](/tikv-configuration-file.md#compaction-guard-min-output-file-size) | Modified | Changes the default value from `"8MB"` to `"1MB"` to reduce the data volume of compaction tasks in RocksDB. 
| +| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].optimize-filters-for-memory](/tikv-configuration-file.md#optimize-filters-for-memory-new-in-v720) | Newly added | Controls whether to generate Bloom/Ribbon filters that minimize memory internal fragmentation. | +| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].periodic-compaction-seconds](/tikv-configuration-file.md#periodic-compaction-seconds-new-in-v720) | Newly added | Controls the time interval for periodic compaction. SST files with updates older than this value will be selected for compaction and rewritten to the same level where these SST files originally reside. | +| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].ribbon-filter-above-level](/tikv-configuration-file.md#ribbon-filter-above-level-new-in-v720) | Newly added | Controls whether to use Ribbon filters for levels greater than or equal to this value and use non-block-based bloom filters for levels less than this value. | +| TiKV | [rocksdb.\[defaultcf\|writecf\|lockcf\].ttl](/tikv-configuration-file.md#ttl-new-in-v720) | Newly added | SST files with updates older than the TTL will be automatically selected for compaction. | +| TiDB Lightning | `send-kv-pairs` | Deprecated | Starting from v7.2.0, the parameter `send-kv-pairs` is deprecated. You can use [`send-kv-size`](/tidb-lightning/tidb-lightning-configuration.md) to control the maximum size of one request when sending data to TiKV in physical import mode. | +| TiDB Lightning | [`character-set`](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task) | Modified | Introduces a new value option `latin1` for the supported character sets of data import. You can use this option to import source files with the Latin-1 character set. | +| TiDB Lightning | [`send-kv-size`](/tidb-lightning/tidb-lightning-configuration.md) | Newly added | Specify the maximum size of one request when sending data to TiKV in physical import mode. 
When the size of key-value pairs reaches the specified threshold, TiDB Lightning will immediately send them to TiKV. This avoids the OOM problems caused by TiDB Lightning nodes accumulating too many key-value pairs in memory when importing large wide tables. By adjusting this parameter, you can find a balance between memory usage and import speed, improving the stability and efficiency of the import process. | +| Data Migration | [`strict-optimistic-shard-mode`](/dm/feature-shard-merge-optimistic.md) | Newly added | This configuration item provides compatibility with the DDL shard merge behavior in TiDB Data Migration v2.0. You can enable this configuration item in optimistic mode. After this is enabled, the replication task will be interrupted when it encounters a Type 2 DDL statement. This interrupts the task promptly in scenarios where DDL changes in multiple tables depend on each other. You need to manually process the DDL statements of each table before resuming the replication task to ensure data consistency between the upstream and the downstream. | +| TiCDC | [`sink.protocol`](/ticdc/ticdc-changefeed-config.md) | Modified | Introduces a new value option `"open-protocol"` when the downstream is Kafka. Specifies the protocol format used for encoding messages. | +| TiCDC | [`sink.delete-only-output-handle-key-columns`](/ticdc/ticdc-changefeed-config.md) | Newly added | Specifies the output of DELETE events. This parameter is valid only for `"canal-json"` and `"open-protocol"` protocols. The default value is `false`, which means outputting all columns. When you set it to `true`, only primary key columns or unique index columns are output.
| + +## Improvements + ++ TiDB + + - Optimize the logic of constructing index scan range so that it supports converting complex conditions into index scan range [#41572](https://github.com/pingcap/tidb/issues/41572) [#44389](https://github.com/pingcap/tidb/issues/44389) @[xuyifangreeneyes](https://github.com/xuyifangreeneyes) + - Add new monitoring metrics `Stale Read OPS` and `Stale Read Traffic` [#43325](https://github.com/pingcap/tidb/issues/43325) @[you06](https://github.com/you06) + - When the retry leader of stale read encounters a lock, TiDB forcibly retries with the leader after resolving the lock, which avoids unnecessary overhead [#43659](https://github.com/pingcap/tidb/issues/43659) @[you06](https://github.com/you06) + - Use estimated time to calculate stale read ts and reduce the overhead of stale read [#44215](https://github.com/pingcap/tidb/issues/44215) @[you06](https://github.com/you06) + - Add logs and system variables for long-running transactions [#41471](https://github.com/pingcap/tidb/issues/41471) @[crazycs520](https://github.com/crazycs520) + - Support connecting to TiDB through the compressed MySQL protocol, which improves the performance of data-intensive queries under low bandwidth networks and saves bandwidth costs. This supports both `zlib` and `zstd` based compression. 
[#22605](https://github.com/pingcap/tidb/issues/22605) @[dveeden](https://github.com/dveeden) + - Recognize both `utf8` and `utf8mb3` as the legacy three-byte UTF-8 character set encodings, which facilitates the migration of tables with legacy UTF-8 encodings from MySQL 8.0 to TiDB [#26226](https://github.com/pingcap/tidb/issues/26226) @[dveeden](https://github.com/dveeden) + - Support using `:=` for assignment in `UPDATE` statements [#44751](https://github.com/pingcap/tidb/issues/44751) @[CbcWestwolf](https://github.com/CbcWestwolf) + ++ TiKV + + - Support configuring the retry interval of PD connections in scenarios such as connection request failures using `pd.retry-interval` [#14964](https://github.com/tikv/tikv/issues/14964) @[rleungx](https://github.com/rleungx) + - Optimize the resource control scheduling algorithm by incorporating the global resource usage [#14604](https://github.com/tikv/tikv/issues/14604) @[Connor1996](https://github.com/Connor1996) + - Use gzip compression for `check_leader` requests to reduce traffic [#14553](https://github.com/tikv/tikv/issues/14553) @[you06](https://github.com/you06) + - Add related metrics for `check_leader` requests [#14658](https://github.com/tikv/tikv/issues/14658) @[you06](https://github.com/you06) + - Provide detailed time information when TiKV handles write commands [#12362](https://github.com/tikv/tikv/issues/12362) @[cfzjywxk](https://github.com/cfzjywxk) + ++ PD + + - Use a separate gRPC connection for PD leader election to prevent the impact of other requests [#6403](https://github.com/tikv/pd/issues/6403) @[rleungx](https://github.com/rleungx) + - Enable the bucket splitting by default to mitigate hotspot issues in multi-Region scenarios [#6433](https://github.com/tikv/pd/issues/6433) @[bufferflies](https://github.com/bufferflies) + ++ Tools + + + Backup & Restore (BR) + + - Support access to Azure Blob Storage by shared access signature (SAS) [#44199](https://github.com/pingcap/tidb/issues/44199)
@[Leavrth](https://github.com/Leavrth) + + + TiCDC + + - Optimize the structure of the directory where data files are stored when a DDL operation occurs in the scenario of replication to an object storage service [#8891](https://github.com/pingcap/tiflow/issues/8891) @[CharlesCheung96](https://github.com/CharlesCheung96) + - Support the OAUTHBEARER authentication in the scenario of replication to Kafka [#8865](https://github.com/pingcap/tiflow/issues/8865) @[hi-rustin](https://github.com/hi-rustin) + - Add the option of outputting only the handle keys for the `DELETE` operation in the scenario of replication to Kafka [#9143](https://github.com/pingcap/tiflow/issues/9143) @[3AceShowHand](https://github.com/3AceShowHand) + + + TiDB Data Migration (DM) + + - Support reading compressed binlogs in MySQL 8.0 as a data source for incremental replication [#6381](https://github.com/pingcap/tiflow/issues/6381) @[dveeden](https://github.com/dveeden) + + + TiDB Lightning + + - Optimize the retry mechanism during import to avoid errors caused by leader switching [#44478](https://github.com/pingcap/tidb/pull/44478) @[lance6716](https://github.com/lance6716) + - Verify checksum through SQL after import to improve stability of verification [#41941](https://github.com/pingcap/tidb/issues/41941) @[GMHDBJD](https://github.com/GMHDBJD) + - Optimize TiDB Lightning OOM issues when importing wide tables [#43853](https://github.com/pingcap/tidb/issues/43853) @[D3Hunter](https://github.com/D3Hunter) + +## Bug fixes + ++ TiDB + + - Fix the issue that the query with CTE causes TiDB to hang [#43749](https://github.com/pingcap/tidb/issues/43749) [#36896](https://github.com/pingcap/tidb/issues/36896) @[guo-shaoge](https://github.com/guo-shaoge) + - Fix the issue that the `min, max` query result is incorrect [#43805](https://github.com/pingcap/tidb/issues/43805) @[wshwsh12](https://github.com/wshwsh12) + - Fix the issue that the `SHOW PROCESSLIST` statement cannot display the TxnStart of the
transaction of the statement with a long subquery time [#40851](https://github.com/pingcap/tidb/issues/40851) @[crazycs520](https://github.com/crazycs520) + - Fix the issue that the stale read global optimization does not take effect due to the lack of `TxnScope` in Coprocessor tasks [#43365](https://github.com/pingcap/tidb/issues/43365) @[you06](https://github.com/you06) + - Fix the issue that follower read does not handle flashback errors before retrying, which causes query errors [#43673](https://github.com/pingcap/tidb/issues/43673) @[you06](https://github.com/you06) + - Fix the issue that data and indexes are inconsistent when the `ON UPDATE` statement does not correctly update the primary key [#44565](https://github.com/pingcap/tidb/issues/44565) @[zyguan](https://github.com/zyguan) + - Modify the upper limit of the `UNIX_TIMESTAMP()` function to `3001-01-19 03:14:07.999999 UTC` to be consistent with that of MySQL 8.0.28 or later versions [#43987](https://github.com/pingcap/tidb/issues/43987) @[YangKeao](https://github.com/YangKeao) + - Fix the issue that adding an index fails in the ingest mode [#44137](https://github.com/pingcap/tidb/issues/44137) @[tangenta](https://github.com/tangenta) + - Fix the issue that canceling a DDL task in the rollback state causes errors in related metadata [#44143](https://github.com/pingcap/tidb/issues/44143) @[wjhuang2016](https://github.com/wjhuang2016) + - Fix the issue that using `memTracker` with cursor fetch causes memory leaks [#44254](https://github.com/pingcap/tidb/issues/44254) @[YangKeao](https://github.com/YangKeao) + - Fix the issue that dropping a database causes slow GC progress [#33069](https://github.com/pingcap/tidb/issues/33069) @[tiancaiamao](https://github.com/tiancaiamao) + - Fix the issue that TiDB returns an error when the corresponding rows in partitioned tables cannot be found in the probe phase of index join [#43686](https://github.com/pingcap/tidb/issues/43686) 
@[AilinKid](https://github.com/AilinKid) @[mjonss](https://github.com/mjonss) + - Fix the issue that there is no warning when using `SUBPARTITION` to create partitioned tables [#41198](https://github.com/pingcap/tidb/issues/41198) [#41200](https://github.com/pingcap/tidb/issues/41200) @[mjonss](https://github.com/mjonss) + - Fix the issue that when a query is killed because it exceeds `MAX_EXECUTION_TIME`, the returned error message is inconsistent with that of MySQL [#43031](https://github.com/pingcap/tidb/issues/43031) @[dveeden](https://github.com/dveeden) + - Fix the issue that the `LEADING` hint does not support querying block aliases [#44645](https://github.com/pingcap/tidb/issues/44645) @[qw4990](https://github.com/qw4990) + - Modify the return type of the `LAST_INSERT_ID()` function from VARCHAR to LONGLONG to be consistent with that of MySQL [#44574](https://github.com/pingcap/tidb/issues/44574) @[Defined2014](https://github.com/Defined2014) + - Fix the issue that incorrect results might be returned when using a common table expression (CTE) in statements with non-correlated subqueries [#44051](https://github.com/pingcap/tidb/issues/44051) @[winoros](https://github.com/winoros) + - Fix the issue that Join Reorder might cause incorrect outer join results [#44314](https://github.com/pingcap/tidb/issues/44314) @[AilinKid](https://github.com/AilinKid) + - Fix the issue that `PREPARE stmt FROM "ANALYZE TABLE xxx"` might be killed by `tidb_mem_quota_query` [#44320](https://github.com/pingcap/tidb/issues/44320) @[chrysan](https://github.com/chrysan) + ++ TiKV + + - Fix the issue that the transaction returns an incorrect value when TiKV handles stale pessimistic lock conflicts [#13298](https://github.com/tikv/tikv/issues/13298) @[cfzjywxk](https://github.com/cfzjywxk) + - Fix the issue that in-memory pessimistic lock might cause flashback failures and data inconsistency [#13303](https://github.com/tikv/tikv/issues/13303) @[JmPotato](https://github.com/JmPotato) + 
- Fix the issue that the fair lock might be incorrect when TiKV handles stale requests [#13298](https://github.com/tikv/tikv/issues/13298) @[cfzjywxk](https://github.com/cfzjywxk) + - Fix the issue that `autocommit` and `point get replica read` might break linearizability [#14715](https://github.com/tikv/tikv/issues/14715) @[cfzjywxk](https://github.com/cfzjywxk) + ++ PD + + - Fix the issue that redundant replicas cannot be automatically repaired in some corner cases [#6573](https://github.com/tikv/pd/issues/6573) @[nolouch](https://github.com/nolouch) + ++ TiFlash + + - Fix the issue that queries might consume more memory than needed when the data on the Join build side is very large and contains many small string type columns [#7416](https://github.com/pingcap/tiflash/issues/7416) @[yibin87](https://github.com/yibin87) + ++ Tools + + + Backup & Restore (BR) + + - Fix the issue that `checksum mismatch` is falsely reported in some cases [#44472](https://github.com/pingcap/tidb/issues/44472) @[Leavrth](https://github.com/Leavrth) + - Fix the issue that `resolved lock timeout` is falsely reported in some cases [#43236](https://github.com/pingcap/tidb/issues/43236) @[YuJuncen](https://github.com/YuJuncen) + - Fix the issue that TiDB might panic when restoring statistics information [#44490](https://github.com/pingcap/tidb/issues/44490) @[tangenta](https://github.com/tangenta) + + + TiCDC + + - Fix the issue that Resolved TS does not advance properly in some cases [#8963](https://github.com/pingcap/tiflow/issues/8963) @[CharlesCheung96](https://github.com/CharlesCheung96) + - Fix the issue that the `UPDATE` operation cannot output old values when the Avro or CSV protocol is used [#9086](https://github.com/pingcap/tiflow/issues/9086) @[3AceShowHand](https://github.com/3AceShowHand) + - Fix the issue of excessive downstream pressure caused by reading downstream metadata too frequently when replicating data to Kafka [#8959](https://github.com/pingcap/tiflow/issues/8959) 
@[hi-rustin](https://github.com/hi-rustin) + - Fix the issue of too many downstream logs caused by frequently setting the downstream bidirectional replication-related variables when replicating data to TiDB or MySQL [#9180](https://github.com/pingcap/tiflow/issues/9180) @[asddongmen](https://github.com/asddongmen) + - Fix the issue that the PD node crashing causes the TiCDC node to restart [#8868](https://github.com/pingcap/tiflow/issues/8868) @[asddongmen](https://github.com/asddongmen) + - Fix the issue that TiCDC cannot create a changefeed with a downstream Kafka-on-Pulsar [#8892](https://github.com/pingcap/tiflow/issues/8892) @[hi-rustin](https://github.com/hi-rustin) + + + TiDB Lightning + + - Fix the TiDB Lightning panic issue when `experimental.allow-expression-index` is enabled and the default value is UUID [#44497](https://github.com/pingcap/tidb/issues/44497) @[lichunzhu](https://github.com/lichunzhu) + - Fix the TiDB Lightning panic issue when a task exits while dividing a data file [#43195](https://github.com/pingcap/tidb/issues/43195) @[lance6716](https://github.com/lance6716) + +## Contributors + +We would like to thank the following contributors from the TiDB community: + +- [asjdf](https://github.com/asjdf) +- [blacktear23](https://github.com/blacktear23) +- [Cavan-xu](https://github.com/Cavan-xu) +- [darraes](https://github.com/darraes) +- [demoManito](https://github.com/demoManito) +- [dhysum](https://github.com/dhysum) +- [HappyUncle](https://github.com/HappyUncle) +- [jiyfhust](https://github.com/jiyfhust) +- [L-maple](https://github.com/L-maple) +- [nyurik](https://github.com/nyurik) +- [SeigeC](https://github.com/SeigeC) +- [tangjingyu97](https://github.com/tangjingyu97) \ No newline at end of file diff --git a/releases/release-notes.md b/releases/release-notes.md index c52538ebf6e5e..923f3cb2a0a4f 100644 --- a/releases/release-notes.md +++ b/releases/release-notes.md @@ -5,6 +5,10 @@ aliases: 
['/docs/dev/releases/release-notes/','/docs/dev/releases/rn/'] # TiDB Release Notes +## 7.2 + +- [7.2.0-DMR](/releases/release-7.2.0.md): 2023-06-29 + ## 7.1 - [7.1.0](/releases/release-7.1.0.md): 2023-05-31 diff --git a/releases/release-timeline.md b/releases/release-timeline.md index 2639e79b8508f..4ccf13b17c04d 100644 --- a/releases/release-timeline.md +++ b/releases/release-timeline.md @@ -9,6 +9,7 @@ This document shows all the released TiDB versions in reverse chronological orde | Version | Release Date | | :--- | :--- | +| [7.2.0-DMR](/releases/release-7.2.0.md) | 2023-06-29 | | [6.5.3](/releases/release-6.5.3.md) | 2023-06-14 | | [7.1.0](/releases/release-7.1.0.md) | 2023-05-31 | | [6.5.2](/releases/release-6.5.2.md) | 2023-04-21 | diff --git a/upgrade-tidb-using-tiup.md b/upgrade-tidb-using-tiup.md index 4acd3e44a9412..700c2d4192100 100644 --- a/upgrade-tidb-using-tiup.md +++ b/upgrade-tidb-using-tiup.md @@ -45,7 +45,7 @@ This section introduces the preparation works needed before upgrading your TiDB ### Step 1: Review compatibility changes -Review [the compatibility changes](/releases/release-7.1.0.md#compatibility-changes) in TiDB v7.1.0 release notes. If any changes affect your upgrade, take actions accordingly. +Review [the compatibility changes](/releases/release-7.2.0.md#compatibility-changes) in TiDB v7.2.0 release notes. If any changes affect your upgrade, take actions accordingly. 
### Step 2: Upgrade TiUP or TiUP offline mirror From 71b2bb2ef9447fed41543014bfdfd0bb8d31766c Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Thu, 29 Jun 2023 14:10:43 +0800 Subject: [PATCH 15/30] bump tidb and tiup version to 7.2.0 (#14012) --- br/backup-and-restore-use-cases.md | 6 +-- dm/maintain-dm-using-tiup.md | 2 +- dm/quick-start-create-task.md | 2 +- .../information-schema-tidb-servers-info.md | 2 +- pd-control.md | 2 +- post-installation-check.md | 2 +- production-deployment-using-tiup.md | 12 +++--- quick-start-with-tidb.md | 10 ++--- scale-tidb-using-tiup.md | 4 +- system-variables.md | 2 +- ticdc/deploy-ticdc.md | 4 +- ticdc/ticdc-changefeed-overview.md | 2 +- ticdc/ticdc-open-api-v2.md | 2 +- tidb-binlog/get-started-with-tidb-binlog.md | 2 +- tidb-monitoring-api.md | 2 +- tiflash/create-tiflash-replicas.md | 2 +- tiup/tiup-cluster.md | 40 +++++++++---------- tiup/tiup-component-cluster-deploy.md | 2 +- tiup/tiup-component-cluster-patch.md | 2 +- tiup/tiup-component-cluster-upgrade.md | 2 +- tiup/tiup-component-dm-upgrade.md | 2 +- tiup/tiup-component-management.md | 12 +++--- tiup/tiup-mirror.md | 6 +-- tiup/tiup-playground.md | 4 +- upgrade-tidb-using-tiup.md | 32 +++++++-------- 25 files changed, 80 insertions(+), 80 deletions(-) diff --git a/br/backup-and-restore-use-cases.md b/br/backup-and-restore-use-cases.md index 58cd2de87d87d..57e67d0a7dd81 100644 --- a/br/backup-and-restore-use-cases.md +++ b/br/backup-and-restore-use-cases.md @@ -17,7 +17,7 @@ With PITR, you can satisfy the preceding requirements. ## Deploy the TiDB cluster and BR -To use PITR, you need to deploy a TiDB cluster >= v6.2.0 and update BR to the same version as the TiDB cluster. This document uses v7.1.0 as an example. +To use PITR, you need to deploy a TiDB cluster >= v6.2.0 and update BR to the same version as the TiDB cluster. This document uses v7.2.0 as an example. The following table shows the recommended hardware resources for using PITR in a TiDB cluster. 
@@ -44,13 +44,13 @@ Install or upgrade BR using TiUP: - Install: ```shell - tiup install br:v7.1.0 + tiup install br:v7.2.0 ``` - Upgrade: ```shell - tiup update br:v7.1.0 + tiup update br:v7.2.0 ``` ## Configure backup storage (Amazon S3) diff --git a/dm/maintain-dm-using-tiup.md b/dm/maintain-dm-using-tiup.md index d010ed349fc85..1f08907ca0a41 100644 --- a/dm/maintain-dm-using-tiup.md +++ b/dm/maintain-dm-using-tiup.md @@ -390,7 +390,7 @@ All operations above performed on the cluster machine use the SSH client embedde Then you can use the `--native-ssh` command-line flag to enable the system-native command-line tool: -- Deploy a cluster: `tiup dm deploy --native-ssh`. Fill in the name of your cluster for ``, the DM version to be deployed (such as `v7.1.0`) for `` , and the topology file name for ``. +- Deploy a cluster: `tiup dm deploy --native-ssh`. Fill in the name of your cluster for ``, the DM version to be deployed (such as `v7.2.0`) for `` , and the topology file name for ``. - Start a cluster: `tiup dm start --native-ssh`. - Upgrade a cluster: `tiup dm upgrade ... 
--native-ssh` diff --git a/dm/quick-start-create-task.md b/dm/quick-start-create-task.md index 4294ba0b10ae8..9493e55f74a0c 100644 --- a/dm/quick-start-create-task.md +++ b/dm/quick-start-create-task.md @@ -74,7 +74,7 @@ To run a TiDB server, use the following command: {{< copyable "shell-regular" >}} ```bash -wget https://download.pingcap.org/tidb-community-server-v7.1.0-linux-amd64.tar.gz +wget https://download.pingcap.org/tidb-community-server-v7.2.0-linux-amd64.tar.gz tar -xzvf tidb-latest-linux-amd64.tar.gz mv tidb-latest-linux-amd64/bin/tidb-server ./ ./tidb-server diff --git a/information-schema/information-schema-tidb-servers-info.md b/information-schema/information-schema-tidb-servers-info.md index b7e5ae5a693e7..891ed288c46e7 100644 --- a/information-schema/information-schema-tidb-servers-info.md +++ b/information-schema/information-schema-tidb-servers-info.md @@ -46,7 +46,7 @@ The output is as follows: PORT: 4000 STATUS_PORT: 10080 LEASE: 45s - VERSION: 5.7.25-TiDB-v7.1.0 + VERSION: 5.7.25-TiDB-v7.2.0 GIT_HASH: 827d8ff2d22ac4c93ae1b841b79d468211e1d393 BINLOG_STATUS: Off LABELS: diff --git a/pd-control.md b/pd-control.md index 84d75a54c3efd..37c7344e442ce 100644 --- a/pd-control.md +++ b/pd-control.md @@ -29,7 +29,7 @@ To obtain `pd-ctl` of the latest version, download the TiDB server installation > **Note:** > -> `{version}` in the link indicates the version number of TiDB. For example, the download link for `v7.1.0` in the `amd64` architecture is `https://download.pingcap.org/tidb-community-server-v7.1.0-linux-amd64.tar.gz`. +> `{version}` in the link indicates the version number of TiDB. For example, the download link for `v7.2.0` in the `amd64` architecture is `https://download.pingcap.org/tidb-community-server-v7.2.0-linux-amd64.tar.gz`. 
### Compile from source code diff --git a/post-installation-check.md b/post-installation-check.md index b33931e63e7e1..60de180354951 100644 --- a/post-installation-check.md +++ b/post-installation-check.md @@ -63,7 +63,7 @@ The following information indicates successful login: ```sql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 -Server version: 5.7.25-TiDB-v7.1.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatible +Server version: 5.7.25-TiDB-v7.2.0 TiDB Server (Apache License 2.0) Community Edition, MySQL 5.7 compatible Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective diff --git a/production-deployment-using-tiup.md b/production-deployment-using-tiup.md index e9b9ec091c9a3..ea6e64a5e42f4 100644 --- a/production-deployment-using-tiup.md +++ b/production-deployment-using-tiup.md @@ -139,12 +139,12 @@ Method 2: Manually pack an offline component package using `tiup mirror clone`. If you want to adjust an existing offline mirror (such as adding a new version of a component), take the following steps: - 1. When pulling an offline mirror, you can get an incomplete offline mirror by specifying specific information via parameters, such as the component and version information. For example, you can pull an offline mirror that includes only the offline mirror of TiUP v1.11.3 and TiUP Cluster v1.11.3 by running the following command: + 1. When pulling an offline mirror, you can get an incomplete offline mirror by specifying specific information via parameters, such as the component and version information. 
For example, you can pull an offline mirror that includes only the offline mirror of TiUP v1.12.3 and TiUP Cluster v1.12.3 by running the following command: {{< copyable "shell-regular" >}} ```bash - tiup mirror clone tiup-custom-mirror-v1.11.3 --tiup v1.11.3 --cluster v1.11.3 + tiup mirror clone tiup-custom-mirror-v1.12.3 --tiup v1.12.3 --cluster v1.12.3 ``` If you only need the components for a particular platform, you can specify them using the `--os` or `--arch` parameters. @@ -176,10 +176,10 @@ Method 2: Manually pack an offline component package using `tiup mirror clone`. {{< copyable "shell-regular" >}} ```bash - tiup mirror merge tiup-custom-mirror-v1.11.3 + tiup mirror merge tiup-custom-mirror-v1.12.3 ``` - 5. When the above steps are completed, check the result by running the `tiup list` command. In this document's example, the outputs of both `tiup list tiup` and `tiup list cluster` show that the corresponding components of `v1.11.3` are available. + 5. When the above steps are completed, check the result by running the `tiup list` command. In this document's example, the outputs of both `tiup list tiup` and `tiup list cluster` show that the corresponding components of `v1.12.3` are available. #### Deploy the offline TiUP component @@ -334,13 +334,13 @@ Before you run the `deploy` command, use the `check` and `check --apply` command {{< copyable "shell-regular" >}} ```shell - tiup cluster deploy tidb-test v7.1.0 ./topology.yaml --user root [-p] [-i /home/root/.ssh/gcp_rsa] + tiup cluster deploy tidb-test v7.2.0 ./topology.yaml --user root [-p] [-i /home/root/.ssh/gcp_rsa] ``` In the `tiup cluster deploy` command above: - `tidb-test` is the name of the TiDB cluster to be deployed. -- `v7.1.0` is the version of the TiDB cluster to be deployed. You can see the latest supported versions by running `tiup list tidb`. +- `v7.2.0` is the version of the TiDB cluster to be deployed. You can see the latest supported versions by running `tiup list tidb`. 
- `topology.yaml` is the initialization configuration file. - `--user root` indicates logging into the target machine as the `root` user to complete the cluster deployment. The `root` user is expected to have `ssh` and `sudo` privileges to the target machine. Alternatively, you can use other users with `ssh` and `sudo` privileges to complete the deployment. - `[-i]` and `[-p]` are optional. If you have configured login to the target machine without password, these parameters are not required. If not, choose one of the two parameters. `[-i]` is the private key of the root user (or other users specified by `--user`) that has access to the target machine. `[-p]` is used to input the user password interactively. diff --git a/quick-start-with-tidb.md b/quick-start-with-tidb.md index 048a41ee9c4d9..52dc7064a6467 100644 --- a/quick-start-with-tidb.md +++ b/quick-start-with-tidb.md @@ -81,10 +81,10 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in {{< copyable "shell-regular" >}} ```shell - tiup playground v7.1.0 --db 2 --pd 3 --kv 3 + tiup playground v7.2.0 --db 2 --pd 3 --kv 3 ``` - The command downloads a version cluster to the local machine and starts it, such as v7.1.0. To view the latest version, run `tiup list tidb`. + The command downloads a version cluster to the local machine and starts it, such as v7.2.0. To view the latest version, run `tiup list tidb`. This command returns the access methods of the cluster: @@ -202,10 +202,10 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in {{< copyable "shell-regular" >}} ```shell - tiup playground v7.1.0 --db 2 --pd 3 --kv 3 + tiup playground v7.2.0 --db 2 --pd 3 --kv 3 ``` - The command downloads a version cluster to the local machine and starts it, such as v7.1.0. To view the latest version, run `tiup list tidb`. + The command downloads a version cluster to the local machine and starts it, such as v7.2.0. To view the latest version, run `tiup list tidb`. 
This command returns the access methods of the cluster: @@ -437,7 +437,7 @@ Other requirements for the target machine: ``` - ``: Set the cluster name - - ``: Set the TiDB cluster version, such as `v7.1.0`. You can see all the supported TiDB versions by running the `tiup list tidb` command + - ``: Set the TiDB cluster version, such as `v7.2.0`. You can see all the supported TiDB versions by running the `tiup list tidb` command - `-p`: Specify the password used to connect to the target machine. > **Note:** diff --git a/scale-tidb-using-tiup.md b/scale-tidb-using-tiup.md index ff28a0b93c6f6..c8f1d25656fae 100644 --- a/scale-tidb-using-tiup.md +++ b/scale-tidb-using-tiup.md @@ -274,9 +274,9 @@ This section exemplifies how to remove a TiKV node from the `10.0.1.5` host. ``` ``` - Starting /root/.tiup/components/cluster/v1.11.3/cluster display + Starting /root/.tiup/components/cluster/v1.12.3/cluster display TiDB Cluster: - TiDB Version: v7.1.0 + TiDB Version: v7.2.0 ID Role Host Ports Status Data Dir Deploy Dir -- ---- ---- ----- ------ -------- ---------- 10.0.1.3:8300 cdc 10.0.1.3 8300 Up data/cdc-8300 deploy/cdc-8300 diff --git a/system-variables.md b/system-variables.md index 05b788e8dbda4..b489f9bd7169d 100644 --- a/system-variables.md +++ b/system-variables.md @@ -5201,7 +5201,7 @@ Internally, the TiDB parser transforms the `SET TRANSACTION ISOLATION LEVEL [REA - Scope: NONE - Default value: `5.7.25-TiDB-`(tidb version) -- This variable returns the MySQL version, followed by the TiDB version. For example '5.7.25-TiDB-v7.1.0'. +- This variable returns the MySQL version, followed by the TiDB version. For example '5.7.25-TiDB-v7.2.0'. 
### version_comment diff --git a/ticdc/deploy-ticdc.md b/ticdc/deploy-ticdc.md index b461b15724e34..3d1fbdd75d9a0 100644 --- a/ticdc/deploy-ticdc.md +++ b/ticdc/deploy-ticdc.md @@ -95,7 +95,7 @@ tiup cluster upgrade --transfer-timeout 600 > **Note:** > -> In the preceding command, you need to replace `` and `` with the actual cluster name and cluster version. For example, the version can be v7.1.0. +> In the preceding command, you need to replace `` and `` with the actual cluster name and cluster version. For example, the version can be v7.2.0. ### Upgrade cautions @@ -152,7 +152,7 @@ See [Enable TLS Between TiDB Components](/enable-tls-between-components.md). ## View TiCDC status using the command-line tool -Run the following command to view the TiCDC cluster status. Note that you need to replace `v` with the TiCDC cluster version, such as `v7.1.0`: +Run the following command to view the TiCDC cluster status. Note that you need to replace `v` with the TiCDC cluster version, such as `v7.2.0`: ```shell tiup ctl:v cdc capture list --server=http://10.0.10.25:8300 diff --git a/ticdc/ticdc-changefeed-overview.md b/ticdc/ticdc-changefeed-overview.md index 2c86aaecad885..406c010dcd4a6 100644 --- a/ticdc/ticdc-changefeed-overview.md +++ b/ticdc/ticdc-changefeed-overview.md @@ -38,4 +38,4 @@ You can manage a TiCDC cluster and its replication tasks using the command-line You can also use the HTTP interface (the TiCDC OpenAPI feature) to manage a TiCDC cluster and its replication tasks. For details, see [TiCDC OpenAPI](/ticdc/ticdc-open-api.md). -If your TiCDC is deployed using TiUP, you can start `cdc cli` by running the `tiup ctl:v cdc` command. Replace `v` with the TiCDC cluster version, such as `v7.1.0`. You can also run `cdc cli` directly. +If your TiCDC is deployed using TiUP, you can start `cdc cli` by running the `tiup ctl:v cdc` command. Replace `v` with the TiCDC cluster version, such as `v7.2.0`. You can also run `cdc cli` directly. 
diff --git a/ticdc/ticdc-open-api-v2.md b/ticdc/ticdc-open-api-v2.md index a714c52f40879..2dc5ed9d19ad6 100644 --- a/ticdc/ticdc-open-api-v2.md +++ b/ticdc/ticdc-open-api-v2.md @@ -92,7 +92,7 @@ curl -X GET http://127.0.0.1:8300/api/v2/status ```json { - "version": "v7.1.0", + "version": "v7.2.0", "git_hash": "10413bded1bdb2850aa6d7b94eb375102e9c44dc", "id": "d2912e63-3349-447c-90ba-72a4e04b5e9e", "pid": 1447, diff --git a/tidb-binlog/get-started-with-tidb-binlog.md b/tidb-binlog/get-started-with-tidb-binlog.md index 9e591eec91e2f..d72c333cf4a65 100644 --- a/tidb-binlog/get-started-with-tidb-binlog.md +++ b/tidb-binlog/get-started-with-tidb-binlog.md @@ -43,7 +43,7 @@ sudo yum install -y mariadb-server ``` ```bash -curl -L https://download.pingcap.org/tidb-community-server-v7.1.0-linux-amd64.tar.gz | tar xzf - +curl -L https://download.pingcap.org/tidb-community-server-v7.2.0-linux-amd64.tar.gz | tar xzf - cd tidb-latest-linux-amd64 ``` diff --git a/tidb-monitoring-api.md b/tidb-monitoring-api.md index 0df977cb6a979..2ad02ab29ca29 100644 --- a/tidb-monitoring-api.md +++ b/tidb-monitoring-api.md @@ -28,7 +28,7 @@ The following example uses `http://${host}:${port}/status` to get the current st curl http://127.0.0.1:10080/status { connections: 0, # The current number of clients connected to the TiDB server. - version: "5.7.25-TiDB-v7.1.0", # The TiDB version number. + version: "5.7.25-TiDB-v7.2.0", # The TiDB version number. git_hash: "778c3f4a5a716880bcd1d71b257c8165685f0d70" # The Git Hash of the current TiDB code. 
} ``` diff --git a/tiflash/create-tiflash-replicas.md b/tiflash/create-tiflash-replicas.md index cc9dcedbe477f..26e706303228c 100644 --- a/tiflash/create-tiflash-replicas.md +++ b/tiflash/create-tiflash-replicas.md @@ -143,7 +143,7 @@ Before TiFlash replicas are added, each TiKV instance performs a full table scan tiup ctl:v pd -u http://:2379 store limit all engine tiflash 60 add-peer ``` - > In the preceding command, you need to replace `v` with the actual cluster version, such as `v7.1.0` and `:2379` with the address of any PD node. For example: + > In the preceding command, you need to replace `v` with the actual cluster version, such as `v7.2.0` and `:2379` with the address of any PD node. For example: > > ```shell > tiup ctl:v6.1.1 pd -u http://192.168.1.4:2379 store limit all engine tiflash 60 add-peer diff --git a/tiup/tiup-cluster.md b/tiup/tiup-cluster.md index da20b488568a0..57594e77f1d38 100644 --- a/tiup/tiup-cluster.md +++ b/tiup/tiup-cluster.md @@ -17,7 +17,7 @@ tiup cluster ``` ``` -Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.11.3/cluster +Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.12.3/cluster Deploy a TiDB cluster for production Usage: @@ -62,7 +62,7 @@ To deploy the cluster, run the `tiup cluster deploy` command. The usage of the c tiup cluster deploy [flags] ``` -This command requires you to provide the cluster name, the TiDB cluster version (such as `v7.1.0`), and a topology file of the cluster. +This command requires you to provide the cluster name, the TiDB cluster version (such as `v7.2.0`), and a topology file of the cluster. To write a topology file, refer to [the example](https://github.com/pingcap/tiup/blob/master/embed/examples/cluster/topology.example.yaml). The following file is an example of the simplest topology: @@ -119,12 +119,12 @@ tidb_servers: ... ``` -Save the file as `/tmp/topology.yaml`. 
If you want to use TiDB v7.1.0 and your cluster name is `prod-cluster`, run the following command: +Save the file as `/tmp/topology.yaml`. If you want to use TiDB v7.2.0 and your cluster name is `prod-cluster`, run the following command: {{< copyable "shell-regular" >}} ```shell -tiup cluster deploy -p prod-cluster v7.1.0 /tmp/topology.yaml +tiup cluster deploy -p prod-cluster v7.2.0 /tmp/topology.yaml ``` During the execution, TiUP asks you to confirm your topology again and requires the root password of the target machine (the `-p` flag means inputting password): @@ -132,7 +132,7 @@ During the execution, TiUP asks you to confirm your topology again and requires ```bash Please confirm your topology: TiDB Cluster: prod-cluster -TiDB Version: v7.1.0 +TiDB Version: v7.2.0 Type Host Ports OS/Arch Directories ---- ---- ----- ------- ----------- pd 172.16.5.134 2379/2380 linux/x86_64 deploy/pd-2379,data/pd-2379 @@ -172,10 +172,10 @@ tiup cluster list ``` ``` -Starting /root/.tiup/components/cluster/v1.11.3/cluster list +Starting /root/.tiup/components/cluster/v1.12.3/cluster list Name User Version Path PrivateKey ---- ---- ------- ---- ---------- -prod-cluster tidb v7.1.0 /root/.tiup/storage/cluster/clusters/prod-cluster /root/.tiup/storage/cluster/clusters/prod-cluster/ssh/id_rsa +prod-cluster tidb v7.2.0 /root/.tiup/storage/cluster/clusters/prod-cluster /root/.tiup/storage/cluster/clusters/prod-cluster/ssh/id_rsa ``` ## Start the cluster @@ -203,9 +203,9 @@ tiup cluster display prod-cluster ``` ``` -Starting /root/.tiup/components/cluster/v1.11.3/cluster display prod-cluster +Starting /root/.tiup/components/cluster/v1.12.3/cluster display prod-cluster TiDB Cluster: prod-cluster -TiDB Version: v7.1.0 +TiDB Version: v7.2.0 ID Role Host Ports OS/Arch Status Data Dir Deploy Dir -- ---- ---- ----- ------- ------ -------- ---------- 172.16.5.134:3000 grafana 172.16.5.134 3000 linux/x86_64 Up - deploy/grafana-3000 @@ -277,9 +277,9 @@ tiup cluster display prod-cluster ``` ``` 
-Starting /root/.tiup/components/cluster/v1.11.3/cluster display prod-cluster +Starting /root/.tiup/components/cluster/v1.12.3/cluster display prod-cluster TiDB Cluster: prod-cluster -TiDB Version: v7.1.0 +TiDB Version: v7.2.0 ID Role Host Ports OS/Arch Status Data Dir Deploy Dir -- ---- ---- ----- ------- ------ -------- ---------- 172.16.5.134:3000 grafana 172.16.5.134 3000 linux/x86_64 Up - deploy/grafana-3000 @@ -390,12 +390,12 @@ Global Flags: -y, --yes Skip all confirmations and assumes 'yes' ``` -For example, the following command upgrades the cluster to v7.1.0: +For example, the following command upgrades the cluster to v7.2.0: {{< copyable "shell-regular" >}} ```bash -tiup cluster upgrade tidb-test v7.1.0 +tiup cluster upgrade tidb-test v7.2.0 ``` ## Update configuration @@ -577,14 +577,14 @@ tiup cluster audit ``` ``` -Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.11.3/cluster audit +Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.12.3/cluster audit ID Time Command -- ---- ------- -4BLhr0 2023-05-31T23:55:09+08:00 /home/tidb/.tiup/components/cluster/v1.11.3/cluster deploy test v7.1.0 /tmp/topology.yaml -4BKWjF 2022-03-029T23:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.11.3/cluster deploy test v7.1.0 /tmp/topology.yaml -4BKVwH 2023-05-31T23:02:08+08:00 /home/tidb/.tiup/components/cluster/v1.11.3/cluster deploy test v7.1.0 /tmp/topology.yaml -4BKKH1 2023-05-31T16:39:04+08:00 /home/tidb/.tiup/components/cluster/v1.11.3/cluster destroy test -4BKKDx 2023-05-31T16:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.11.3/cluster deploy test v7.1.0 /tmp/topology.yaml +4BLhr0 2023-06-29T23:55:09+08:00 /home/tidb/.tiup/components/cluster/v1.12.3/cluster deploy test v7.2.0 /tmp/topology.yaml +4BKWjF 2023-06-29T23:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.12.3/cluster deploy test v7.2.0 /tmp/topology.yaml +4BKVwH 2023-06-29T23:02:08+08:00 /home/tidb/.tiup/components/cluster/v1.12.3/cluster deploy test v7.2.0 
/tmp/topology.yaml +4BKKH1 2023-06-29T16:39:04+08:00 /home/tidb/.tiup/components/cluster/v1.12.3/cluster destroy test +4BKKDx 2023-06-29T16:36:57+08:00 /home/tidb/.tiup/components/cluster/v1.12.3/cluster deploy test v7.2.0 /tmp/topology.yaml ``` The first column is `audit-id`. To view the execution log of a certain command, pass the `audit-id` of a command as the flag as follows: @@ -700,7 +700,7 @@ All operations above performed on the cluster machine use the SSH client embedde Then you can use the `--ssh=system` command-line flag to enable the system-native command-line tool: -- Deploy a cluster: `tiup cluster deploy --ssh=system`. Fill in the name of your cluster for ``, the TiDB version to be deployed (such as `v7.1.0`) for ``, and the topology file for ``. +- Deploy a cluster: `tiup cluster deploy --ssh=system`. Fill in the name of your cluster for ``, the TiDB version to be deployed (such as `v7.2.0`) for ``, and the topology file for ``. - Start a cluster: `tiup cluster start --ssh=system` - Upgrade a cluster: `tiup cluster upgrade ... --ssh=system` diff --git a/tiup/tiup-component-cluster-deploy.md b/tiup/tiup-component-cluster-deploy.md index a3d832d91a1cb..b1a5d049f563b 100644 --- a/tiup/tiup-component-cluster-deploy.md +++ b/tiup/tiup-component-cluster-deploy.md @@ -13,7 +13,7 @@ tiup cluster deploy [flags] ``` - ``: the name of the new cluster, which cannot be the same as the existing cluster names. -- ``: the version number of the TiDB cluster to deploy, such as `v7.1.0`. +- ``: the version number of the TiDB cluster to deploy, such as `v7.2.0`. - ``: the prepared [topology file](/tiup/tiup-cluster-topology-reference.md). ## Options diff --git a/tiup/tiup-component-cluster-patch.md b/tiup/tiup-component-cluster-patch.md index cc8910479084e..666fe0343b1e5 100644 --- a/tiup/tiup-component-cluster-patch.md +++ b/tiup/tiup-component-cluster-patch.md @@ -28,7 +28,7 @@ Before running the `tiup cluster patch` command, you need to pack the binary pac 1. 
Determine the following variables: - `${component}`: the name of the component to be replaced (such as `tidb`, `tikv`, or `pd`). - - `${version}`: the version of the component (such as `v7.1.0` or `v6.5.1`). + - `${version}`: the version of the component (such as `v7.2.0` or `v6.5.1`). - `${os}`: the operating system (`linux`). - `${arch}`: the platform on which the component runs (`amd64`, `arm64`). diff --git a/tiup/tiup-component-cluster-upgrade.md b/tiup/tiup-component-cluster-upgrade.md index cc81f79d6b726..8b0a5189a405d 100644 --- a/tiup/tiup-component-cluster-upgrade.md +++ b/tiup/tiup-component-cluster-upgrade.md @@ -13,7 +13,7 @@ tiup cluster upgrade [flags] ``` - ``: the cluster name to operate on. If you forget the cluster name, you can check it with the [cluster list](/tiup/tiup-component-cluster-list.md) command. -- ``: the target version to upgrade to, such as `v7.1.0`. Currently, it is only allowed to upgrade to a version higher than the current cluster, that is, no downgrade is allowed. It is also not allowed to upgrade to the nightly version. +- ``: the target version to upgrade to, such as `v7.2.0`. Currently, it is only allowed to upgrade to a version higher than the current cluster, that is, no downgrade is allowed. It is also not allowed to upgrade to the nightly version. ## Options diff --git a/tiup/tiup-component-dm-upgrade.md b/tiup/tiup-component-dm-upgrade.md index 4eb30e95d014d..831871a66c092 100644 --- a/tiup/tiup-component-dm-upgrade.md +++ b/tiup/tiup-component-dm-upgrade.md @@ -13,7 +13,7 @@ tiup dm upgrade [flags] ``` - `` is the name of the cluster to be operated on. If you forget the cluster name, you can check it using the [`tiup dm list`](/tiup/tiup-component-dm-list.md) command. -- `` is the target version to be upgraded to, such as `v7.1.0`. Currently, only upgrading to a later version is allowed, and upgrading to an earlier version is not allowed, which means the downgrade is not allowed. 
Upgrading to a nightly version is not allowed either. +- `` is the target version to be upgraded to, such as `v7.2.0`. Currently, only upgrading to a later version is allowed, and upgrading to an earlier version is not allowed, which means the downgrade is not allowed. Upgrading to a nightly version is not allowed either. ## Options diff --git a/tiup/tiup-component-management.md b/tiup/tiup-component-management.md index 4f9ffefb71e13..c1bea5a11cf23 100644 --- a/tiup/tiup-component-management.md +++ b/tiup/tiup-component-management.md @@ -70,12 +70,12 @@ Example 2: Use TiUP to install the nightly version of TiDB. tiup install tidb:nightly ``` -Example 3: Use TiUP to install TiKV v7.1.0. +Example 3: Use TiUP to install TiKV v7.2.0. {{< copyable "shell-regular" >}} ```shell -tiup install tikv:v7.1.0 +tiup install tikv:v7.2.0 ``` ## Upgrade components @@ -128,12 +128,12 @@ Before the component is started, TiUP creates a directory for it, and then puts If you want to start the same component multiple times and reuse the previous working directory, you can use `--tag` to specify the same name when the component is started. After the tag is specified, the working directory will *not be automatically deleted* when the instance is terminated, which makes it convenient to reuse the working directory. -Example 1: Operate TiDB v7.1.0. +Example 1: Operate TiDB v7.2.0. {{< copyable "shell-regular" >}} ```shell -tiup tidb:v7.1.0 +tiup tidb:v7.2.0 ``` Example 2: Specify the tag with which TiKV operates. @@ -219,12 +219,12 @@ The following flags are supported in this command: - If the version is ignored, adding `--all` means to uninstall all versions of this component. - If the version and the component are both ignored, adding `--all` means to uninstall all components of all versions. -Example 1: Uninstall TiDB v7.1.0. +Example 1: Uninstall TiDB v7.2.0. 
{{< copyable "shell-regular" >}} ```shell -tiup uninstall tidb:v7.1.0 +tiup uninstall tidb:v7.2.0 ``` Example 2: Uninstall TiKV of all versions. diff --git a/tiup/tiup-mirror.md b/tiup/tiup-mirror.md index d311c3d0987d5..23c181d7ce698 100644 --- a/tiup/tiup-mirror.md +++ b/tiup/tiup-mirror.md @@ -87,9 +87,9 @@ The `tiup mirror clone` command provides many optional flags (might provide more If you want to clone only one version (not all versions) of a component, use `--=` to specify this version. For example: - - Execute the `tiup mirror clone --tidb v7.1.0` command to clone the v7.1.0 version of the TiDB component. - - Run the `tiup mirror clone --tidb v7.1.0 --tikv all` command to clone the v7.1.0 version of the TiDB component and all versions of the TiKV component. - - Run the `tiup mirror clone v7.1.0` command to clone the v7.1.0 version of all components in a cluster. + - Execute the `tiup mirror clone --tidb v7.2.0` command to clone the v7.2.0 version of the TiDB component. + - Run the `tiup mirror clone --tidb v7.2.0 --tikv all` command to clone the v7.2.0 version of the TiDB component and all versions of the TiKV component. + - Run the `tiup mirror clone v7.2.0` command to clone the v7.2.0 version of all components in a cluster. After cloning, signing keys are set up automatically. diff --git a/tiup/tiup-playground.md b/tiup/tiup-playground.md index 6248bff396605..4381609d880ce 100644 --- a/tiup/tiup-playground.md +++ b/tiup/tiup-playground.md @@ -20,9 +20,9 @@ If you directly execute the `tiup playground` command, TiUP uses the locally ins This command actually performs the following operations: -- Because this command does not specify the version of the playground component, TiUP first checks the latest version of the installed playground component. Assume that the latest version is v1.11.3, then this command works the same as `tiup playground:v1.11.3`. 
+- Because this command does not specify the version of the playground component, TiUP first checks the latest version of the installed playground component. Assume that the latest version is v1.12.3, then this command works the same as `tiup playground:v1.12.3`. - If you have not used TiUP playground to install the TiDB, TiKV, and PD components, the playground component installs the latest stable version of these components, and then start these instances. -- Because this command does not specify the version of the TiDB, PD, and TiKV component, TiUP playground uses the latest version of each component by default. Assume that the latest version is v7.1.0, then this command works the same as `tiup playground:v1.11.3 v7.1.0`. +- Because this command does not specify the version of the TiDB, PD, and TiKV component, TiUP playground uses the latest version of each component by default. Assume that the latest version is v7.2.0, then this command works the same as `tiup playground:v1.12.3 v7.2.0`. - Because this command does not specify the number of each component, TiUP playground, by default, starts a smallest cluster that consists of one TiDB instance, one TiKV instance, one PD instance, and one TiFlash instance. - After starting each TiDB component, TiUP playground reminds you that the cluster is successfully started and provides you some useful information, such as how to connect to the TiDB cluster through the MySQL client and how to access the [TiDB Dashboard](/dashboard/dashboard-intro.md). diff --git a/upgrade-tidb-using-tiup.md b/upgrade-tidb-using-tiup.md index 700c2d4192100..d3dfd44146d25 100644 --- a/upgrade-tidb-using-tiup.md +++ b/upgrade-tidb-using-tiup.md @@ -8,10 +8,10 @@ aliases: ['/docs/dev/upgrade-tidb-using-tiup/','/docs/dev/how-to/upgrade/using-t This document is targeted for the following upgrade paths: -- Upgrade from TiDB 4.0 versions to TiDB 7.1. -- Upgrade from TiDB 5.0-5.4 versions to TiDB 7.1. -- Upgrade from TiDB 6.0-6.6 to TiDB 7.1. 
-- Upgrade from TiDB 7.0 to TiDB 7.1. +- Upgrade from TiDB 4.0 versions to TiDB 7.2. +- Upgrade from TiDB 5.0-5.4 versions to TiDB 7.2. +- Upgrade from TiDB 6.0-6.6 to TiDB 7.2. +- Upgrade from TiDB 7.0-7.1 to TiDB 7.2. > **Warning:** > @@ -23,17 +23,17 @@ This document is targeted for the following upgrade paths: > **Note:** > -> If your cluster to be upgraded is v3.1 or an earlier version (v3.0 or v2.1), the direct upgrade to v7.1.0 is not supported. You need to upgrade your cluster first to v4.0 and then to v7.1.0. +> If your cluster to be upgraded is v3.1 or an earlier version (v3.0 or v2.1), the direct upgrade to v7.2.0 is not supported. You need to upgrade your cluster first to v4.0 and then to v7.2.0. ## Upgrade caveat - TiDB currently does not support version downgrade or rolling back to an earlier version after the upgrade. -- For the v4.0 cluster managed using TiDB Ansible, you need to import the cluster to TiUP (`tiup cluster`) for new management according to [Upgrade TiDB Using TiUP (v4.0)](https://docs.pingcap.com/tidb/v4.0/upgrade-tidb-using-tiup#import-tidb-ansible-and-the-inventoryini-configuration-to-tiup). Then you can upgrade the cluster to v7.1.0 according to this document. -- To update versions earlier than v3.0 to v7.1.0: +- For the v4.0 cluster managed using TiDB Ansible, you need to import the cluster to TiUP (`tiup cluster`) for new management according to [Upgrade TiDB Using TiUP (v4.0)](https://docs.pingcap.com/tidb/v4.0/upgrade-tidb-using-tiup#import-tidb-ansible-and-the-inventoryini-configuration-to-tiup). Then you can upgrade the cluster to v7.2.0 according to this document. +- To update versions earlier than v3.0 to v7.2.0: 1. Update this version to 3.0 using [TiDB Ansible](https://docs.pingcap.com/tidb/v3.0/upgrade-tidb-using-ansible). 2. Use TiUP (`tiup cluster`) to import the TiDB Ansible configuration. 3. 
Update the 3.0 version to 4.0 according to [Upgrade TiDB Using TiUP (v4.0)](https://docs.pingcap.com/tidb/v4.0/upgrade-tidb-using-tiup#import-tidb-ansible-and-the-inventoryini-configuration-to-tiup). - 4. Upgrade the cluster to v7.1.0 according to this document. + 4. Upgrade the cluster to v7.2.0 according to this document. - Support upgrading the versions of TiDB Binlog, TiCDC, TiFlash, and other components. - When upgrading TiFlash from versions earlier than v6.3.0 to v6.3.0 and later versions, note that the CPU must support the AVX2 instruction set under the Linux AMD64 architecture and the ARMv8 instruction set architecture under the Linux ARM64 architecture. For details, see the description in [v6.3.0 Release Notes](/releases/release-6.3.0.md#others). - For detailed compatibility changes of different versions, see the [Release Notes](/releases/release-notes.md) of each version. Modify your cluster configuration according to the "Compatibility Changes" section of the corresponding release notes. @@ -120,7 +120,7 @@ Now, the offline mirror has been upgraded successfully. If an error occurs durin > Skip this step if one of the following situations applies: > > + You have not modified the configuration parameters of the original cluster. Or you have modified the configuration parameters using `tiup cluster` but no more modification is needed. -> + After the upgrade, you want to use v7.1.0's default parameter values for the unmodified configuration items. +> + After the upgrade, you want to use v7.2.0's default parameter values for the unmodified configuration items. 1. Enter the `vi` editing mode to edit the topology file: @@ -136,7 +136,7 @@ Now, the offline mirror has been upgraded successfully. If an error occurs durin > **Note:** > -> Before you upgrade the cluster to v6.6.0, make sure that the parameters you have modified in v4.0 are compatible in v7.1.0. For details, see [TiKV Configuration File](/tikv-configuration-file.md). 
+> Before you upgrade the cluster to v6.6.0, make sure that the parameters you have modified in v4.0 are compatible in v7.2.0. For details, see [TiKV Configuration File](/tikv-configuration-file.md). ### Step 4: Check the health status of the current cluster @@ -180,12 +180,12 @@ If your application has a maintenance window for the database to be stopped for tiup cluster upgrade ``` -For example, if you want to upgrade the cluster to v7.1.0: +For example, if you want to upgrade the cluster to v7.2.0: {{< copyable "shell-regular" >}} ```shell -tiup cluster upgrade v7.1.0 +tiup cluster upgrade v7.2.0 ``` > **Note:** @@ -213,7 +213,7 @@ tiup cluster upgrade v7.1.0 tiup cluster stop ``` -2. Use the `upgrade` command with the `--offline` option to perform the offline upgrade. Fill in the name of your cluster for `` and the version to upgrade to for ``, such as `v7.1.0`. +2. Use the `upgrade` command with the `--offline` option to perform the offline upgrade. Fill in the name of your cluster for `` and the version to upgrade to for ``, such as `v7.2.0`. {{< copyable "shell-regular" >}} @@ -242,7 +242,7 @@ tiup cluster display ``` Cluster type: tidb Cluster name: -Cluster version: v7.1.0 +Cluster version: v7.2.0 ``` ## FAQ @@ -273,7 +273,7 @@ Re-execute the `tiup cluster upgrade` command to resume the upgrade. The upgrade ### The evict leader has waited too long during the upgrade. How to skip this step for a quick upgrade? -You can specify `--force`. Then the processes of transferring PD leader and evicting TiKV leader are skipped during the upgrade. The cluster is directly restarted to update the version, which has a great impact on the cluster that runs online. In the following command, `` is the version to upgrade to, such as `v7.1.0`. +You can specify `--force`. Then the processes of transferring PD leader and evicting TiKV leader are skipped during the upgrade. 
The cluster is directly restarted to update the version, which has a great impact on the cluster that runs online. In the following command, `` is the version to upgrade to, such as `v7.2.0`. {{< copyable "shell-regular" >}} @@ -288,5 +288,5 @@ You can upgrade the tool version by using TiUP to install the `ctl` component of {{< copyable "shell-regular" >}} ```shell -tiup install ctl:v7.1.0 +tiup install ctl:v7.2.0 ``` From fc8d8ca1fe9b2eaaeb7d34d4000d8c9bdc33d2a3 Mon Sep 17 00:00:00 2001 From: Chao Zheng Date: Thu, 29 Jun 2023 00:34:43 -0700 Subject: [PATCH 16/30] fix cdc changefeed config (#14036) --- ticdc/ticdc-changefeed-config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 90799c43981d6..f2afa1a4d798f 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -183,7 +183,7 @@ enable-partition-separator = true # Specifies the replication consistency configurations for a changefeed when using the redo log. For more information, see https://docs.pingcap.com/tidb/stable/ticdc-sink-to-mysql#eventually-consistent-replication-in-disaster-scenarios. # Note: The consistency-related configuration items only take effect when the downstream is a database and the redo log feature is enabled. -[sink.consistent] +[consistent] # The data consistency level. Available options are "none" and "eventual". "none" means that the redo log is disabled. # The default value is "none". 
level = "none" From ffbe388057577fa039a8bb9b9e149684cbfd97f0 Mon Sep 17 00:00:00 2001 From: Ran Date: Fri, 30 Jun 2023 11:56:11 +0800 Subject: [PATCH 17/30] update the supported macos version (#14056) --- hardware-and-software-requirements.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/hardware-and-software-requirements.md b/hardware-and-software-requirements.md index 5555067ea4d75..586e45be75fb4 100644 --- a/hardware-and-software-requirements.md +++ b/hardware-and-software-requirements.md @@ -24,16 +24,16 @@ As an open-source distributed SQL database with high performance, TiDB can be de |
<ul><li>Red Hat Enterprise Linux 7.3 or a later 7.x version</li><li>CentOS 7.3 or a later 7.x version</li></ul> | <ul><li>x86_64</li><li>ARM 64</li></ul> |
 | Amazon Linux 2 | <ul><li>x86_64</li><li>ARM 64</li></ul> |
 | Kylin Euler V10 SP1/SP2 | <ul><li>x86_64</li><li>ARM 64</li></ul> |
-| UOS V20 | <ul><li>x86_64</li><li>ARM 64</li></ul> |
+| UOS V20 | <ul><li>x86_64</li><li>ARM 64</li></ul> |
 | openEuler 22.03 LTS SP1 | x86_64 |
-| macOS Catalina or later | <ul><li>x86_64</li><li>ARM 64</li></ul> |
-| Oracle Enterprise Linux 7.3 or a later 7.x version | x86_64 |
-| Ubuntu LTS 18.04 or later | x86_64 |
+| macOS 12 (Monterey) or later | <ul><li>x86_64</li><li>ARM 64</li></ul> |
+| Oracle Enterprise Linux 7.3 or a later 7.x version | x86_64 |
+| Ubuntu LTS 18.04 or later | x86_64 |
 | CentOS 8 Stream | <ul><li>x86_64</li><li>ARM 64</li></ul>
| -| Debian 9 (Stretch) or later | x86_64 | -| Fedora 35 or later | x86_64 | -| openSUSE Leap later than v15.3 (not including Tumbleweed) | x86_64 | -| SUSE Linux Enterprise Server 15 | x86_64 | +| Debian 9 (Stretch) or later | x86_64 | +| Fedora 35 or later | x86_64 | +| openSUSE Leap later than v15.3 (not including Tumbleweed) | x86_64 | +| SUSE Linux Enterprise Server 15 | x86_64 | > **Note:** > From 236611ec3b095e3cc7604a2087db2eac78ee8dae Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Fri, 30 Jun 2023 23:29:13 +0800 Subject: [PATCH 18/30] add the link of v7.2 release notes to TiDB roadmap (#14057) --- README.md | 1 + tidb-roadmap.md | 1 + 2 files changed, 2 insertions(+) diff --git a/README.md b/README.md index e4fcda6b1173e..da89e7f6b56ea 100644 --- a/README.md +++ b/README.md @@ -24,6 +24,7 @@ Currently, we maintain the following versions of TiDB documentation in different | Branch name | TiDB docs version | | :---------|:----------| | [`master`](https://github.com/pingcap/docs/tree/master) | The latest development version | +| [`release-7.2`](https://github.com/pingcap/docs/tree/release-7.2) | 7.2 Development Milestone Release | | [`release-7.1`](https://github.com/pingcap/docs/tree/release-7.1) | 7.1 LTS (Long-Term Support) version | | [`release-7.0`](https://github.com/pingcap/docs/tree/release-7.0) | 7.0 Development Milestone Release | | [`release-6.6`](https://github.com/pingcap/docs/tree/release-6.6) | 6.6 Development Milestone Release | diff --git a/tidb-roadmap.md b/tidb-roadmap.md index 9fd5c6a329bbf..e0ccf433f3134 100644 --- a/tidb-roadmap.md +++ b/tidb-roadmap.md @@ -272,6 +272,7 @@ You might have been waiting on some items from the last version. 
The following l ## Recently shipped +- [TiDB 7.2.0 Release Notes](https://docs.pingcap.com/tidb/v7.2/release-7.2.0) - [TiDB 7.1.0 Release Notes](https://docs.pingcap.com/tidb/v7.1/release-7.1.0) - [TiDB 7.0.0 Release Notes](https://docs.pingcap.com/tidb/v7.0/release-7.0.0) - [TiDB 6.6.0 Release Notes](https://docs.pingcap.com/tidb/v6.6/release-6.6.0) From a59fa29eaf84891e7f4fa70405e14e7346139143 Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 3 Jul 2023 09:30:42 +0800 Subject: [PATCH 19/30] add encode password description for ticdc doc (#14042) --- ticdc/ticdc-open-api.md | 2 +- ticdc/ticdc-sink-to-kafka.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-open-api.md b/ticdc/ticdc-open-api.md index 2359abf8a77dd..38b36c95846c2 100644 --- a/ticdc/ticdc-open-api.md +++ b/ticdc/ticdc-open-api.md @@ -167,7 +167,7 @@ The configuration parameters of sink are as follows: ### Example -The following request creates a replication task with an ID of `test5` and a `sink_uri` of `blackhome://`. +The following request creates a replication task with an ID of `test5` and a `sink_uri` of `blackhole://`. {{< copyable "shell-regular" >}} diff --git a/ticdc/ticdc-sink-to-kafka.md b/ticdc/ticdc-sink-to-kafka.md index 21b866c7fb0c3..8e6533bbd8bae 100644 --- a/ticdc/ticdc-sink-to-kafka.md +++ b/ticdc/ticdc-sink-to-kafka.md @@ -69,14 +69,14 @@ The following are descriptions of sink URI parameters and values that can be con | `key` | The path of the certificate key file needed to connect to the downstream Kafka instance (optional). | | `insecure-skip-verify` | Whether to skip certificate verification when connecting to the downstream Kafka instance (optional, `false` by default). | | `sasl-user` | The identity (authcid) of SASL/PLAIN or SASL/SCRAM authentication needed to connect to the downstream Kafka instance (optional). 
| -| `sasl-password` | The password of SASL/PLAIN or SASL/SCRAM authentication needed to connect to the downstream Kafka instance (optional). | +| `sasl-password` | The password of SASL/PLAIN or SASL/SCRAM authentication needed to connect to the downstream Kafka instance (optional). If it contains special characters, they need to be URL encoded. | | `sasl-mechanism` | The name of SASL authentication needed to connect to the downstream Kafka instance. The value can be `plain`, `scram-sha-256`, `scram-sha-512`, or `gssapi`. | | `sasl-gssapi-auth-type` | The gssapi authentication type. Values can be `user` or `keytab` (optional). | | `sasl-gssapi-keytab-path` | The gssapi keytab path (optional).| | `sasl-gssapi-kerberos-config-path` | The gssapi kerberos configuration path (optional). | | `sasl-gssapi-service-name` | The gssapi service name (optional). | | `sasl-gssapi-user` | The user name of gssapi authentication (optional). | -| `sasl-gssapi-password` | The password of gssapi authentication (optional). | +| `sasl-gssapi-password` | The password of gssapi authentication (optional). If it contains special characters, they need to be URL encoded. | | `sasl-gssapi-realm` | The gssapi realm name (optional). | | `sasl-gssapi-disable-pafxfast` | Whether to disable the gssapi PA-FX-FAST (optional). | | `dial-timeout` | The timeout in establishing a connection with the downstream Kafka. The default value is `10s`. 
| From b757eb595a3d84421bff829b77389596cbb04708 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Mon, 3 Jul 2023 09:38:40 +0800 Subject: [PATCH 20/30] Update release_notes_update_pr_author_info_add_dup.py (#13793) --- .../release_notes_update_pr_author_info_add_dup.py | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/scripts/release_notes_update_pr_author_info_add_dup.py b/scripts/release_notes_update_pr_author_info_add_dup.py index 021537ef56488..4a443365b3614 100644 --- a/scripts/release_notes_update_pr_author_info_add_dup.py +++ b/scripts/release_notes_update_pr_author_info_add_dup.py @@ -57,10 +57,15 @@ def store_exst_rn(ext_path,main_path): def get_pr_info_from_github(cp_pr_link,cp_pr_title): target_repo_pr_link= cp_pr_link.rsplit('/', 1)[0] - target_pr_number = re.findall(r'\(#(\d+)\)$', cp_pr_title) + target_pr_number = re.findall(r'\(#(\d+)\)$', cp_pr_title) # Match the original PR number in the end of the cherry-pick PR - if len(target_pr_number) > 1: - print ("There is more than one match result of original PR number from the cherry-pick title: " + cp_pr_title ) + if target_pr_number: + if len(target_pr_number) > 1: + print ("There is more than one match result of original PR number from the cherry-pick title: " + cp_pr_title ) + else: + pass + else: + target_pr_number = re.findall(r'\(#(\d+)\)', cp_pr_title) # Match the original PR number in the cherry-pick PR target_pr_link = target_repo_pr_link + '/' + target_pr_number[0] @@ -113,7 +118,7 @@ def update_pr_author_and_release_notes(excel_path): pass ## Add the dup release note info - issue_link = re.search('https://github.com/(pingcap|tikv)/\w+/issues/\d+', current_formated_rn) + issue_link = re.search('https://github.com/(pingcap|tikv)/[\w-]+/issues/\d+', current_formated_rn) for note_pair in note_pairs: if (issue_link.group() == note_pair[0]) and ((current_pr_author in note_pair[4]) or len(note_pair[4]) == 0): # Add the dup release notes only if the issues link is the same 
as the existing one and the current author is in the existing author list print('A duplicated note is found in row ' + str(row_index) + " from " + note_pair[2] + note_pair[1]) From 993226aae605cc46b74781c18769ae3eddd62fac Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 3 Jul 2023 09:47:12 +0800 Subject: [PATCH 21/30] Update command-line-flags-for-tidb-configuration.md (#14038) --- command-line-flags-for-tidb-configuration.md | 2 +- tidb-configuration-file.md | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/command-line-flags-for-tidb-configuration.md b/command-line-flags-for-tidb-configuration.md index 947cb58eefa4a..e4b5b936f2e7e 100644 --- a/command-line-flags-for-tidb-configuration.md +++ b/command-line-flags-for-tidb-configuration.md @@ -114,7 +114,7 @@ When you start the TiDB cluster, you can use command-line options or environment ## `--proxy-protocol-fallbackable` -- Controls whether to enable PROXY protocol fallback mode. When this parameter is set to `true`, TiDB accepts PROXY client connections and client connections without any PROXY protocol header. By default, TiDB only accepts client connections with a PROXY protocol header. +- Controls whether to enable PROXY protocol fallback mode. When this parameter is set to `true`, TiDB accepts client connections that belong to `--proxy-protocol-networks` without using the PROXY protocol specification or without sending a PROXY protocol header. By default, TiDB only accepts client connections that belong to `--proxy-protocol-networks` and send a PROXY protocol header. - Default value: `false` ## `--proxy-protocol-networks` diff --git a/tidb-configuration-file.md b/tidb-configuration-file.md index 0f5525bb20982..74ba6c4bd7785 100644 --- a/tidb-configuration-file.md +++ b/tidb-configuration-file.md @@ -929,6 +929,11 @@ Configuration items related to the PROXY protocol. 
> > Use `*` with caution because it might introduce security risks by allowing a client of any IP address to report its IP address. In addition, using `*` might also cause the internal component that directly connects to TiDB (such as TiDB Dashboard) to be unavailable. +### `fallbackable` New in v6.5.1 + ++ Controls whether to enable the PROXY protocol fallback mode. If this configuration item is set to `true`, TiDB can accept clients that belong to `proxy-protocol.networks` to connect to TiDB without using the PROXY protocol specification or without sending the PROXY protocol header. By default, TiDB only accepts client connections that belong to `proxy-protocol.networks` and send a PROXY protocol header. ++ Default value: `false` + ## experimental The `experimental` section, introduced in v3.1.0, describes the configurations related to the experimental features of TiDB. From 91e4a6f924ffecfe32d32fddd8e40601f4d598e2 Mon Sep 17 00:00:00 2001 From: Ran Date: Mon, 3 Jul 2023 09:54:42 +0800 Subject: [PATCH 22/30] ticdc: add a faq for ticdc (#14039) --- ticdc/ticdc-faq.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/ticdc/ticdc-faq.md b/ticdc/ticdc-faq.md index 572e474320d8b..3e11e0160916e 100644 --- a/ticdc/ticdc-faq.md +++ b/ticdc/ticdc-faq.md @@ -297,3 +297,33 @@ This feature is currently not supported, which might be supported in a future re ## Does TiCDC replication get stuck if the upstream has long-running uncommitted transactions? TiDB has a transaction timeout mechanism. When a transaction runs for a period longer than [`max-txn-ttl`](/tidb-configuration-file.md#max-txn-ttl), TiDB forcibly rolls it back. TiCDC waits for the transaction to be committed before proceeding with the replication, which causes replication delay. + +## Why can't I use the `cdc cli` command to operate a TiCDC cluster deployed by TiDB Operator? 
+ +This is because the default port number of the TiCDC cluster deployed by TiDB Operator is `8301`, while the default port number of the `cdc cli` command to connect to the TiCDC server is `8300`. When using the `cdc cli` command to operate the TiCDC cluster deployed by TiDB Operator, you need to explicitly specify the `--server` parameter, as follows: + +```shell +./cdc cli changefeed list --server "127.0.0.1:8301" +[ + { + "id": "4k-table", + "namespace": "default", + "summary": { + "state": "stopped", + "tso": 441832628003799353, + "checkpoint": "2023-05-30 22:41:57.910", + "error": null + } + }, + { + "id": "big-table", + "namespace": "default", + "summary": { + "state": "normal", + "tso": 441872834546892882, + "checkpoint": "2023-06-01 17:18:13.700", + "error": null + } + } +] +``` From 7c07485efc528f0c165b01c1714b84766cc29f5f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=BA=8C=E6=89=8B=E6=8E=89=E5=8C=85=E5=B7=A5=E7=A8=8B?= =?UTF-8?q?=E5=B8=88?= Date: Mon, 3 Jul 2023 10:15:42 +0800 Subject: [PATCH 23/30] ticdc: add kafka oauth configurations (#13849) --- ticdc/ticdc-changefeed-config.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index f2afa1a4d798f..171e00f32ea74 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -204,4 +204,21 @@ use-file-backend = false integrity-check-level = "none" # Specifies the log level of the Changefeed when the checksum validation for single-row data fails. The default value is "warn". Value options are "warn" and "error". corruption-handle-level = "warn" + +# The following configuration items only take effect when the downstream is Kafka. +[sink.kafka-config] +# The mechanism of Kafka SASL authentication. The default value is empty, indicating that SASL authentication is not used. +sasl-mechanism = "OAUTHBEARER" +# The client-id in the Kafka SASL OAUTHBEARER authentication. The default value is empty. 
This parameter is required when the OAUTHBEARER authentication is used. +sasl-oauth-client-id = "producer-kafka" +# The client-secret in the Kafka SASL OAUTHBEARER authentication. The default value is empty. This parameter is required when the OAUTHBEARER authentication is used. +sasl-oauth-client-secret = "cHJvZHVjZXIta2Fma2E=" +# The token-url in the Kafka SASL OAUTHBEARER authentication to obtain the token. The default value is empty. This parameter is required when the OAUTHBEARER authentication is used. +sasl-oauth-token-url = "http://127.0.0.1:4444/oauth2/token" +# The scopes in the Kafka SASL OAUTHBEARER authentication. The default value is empty. This parameter is optional when the OAUTHBEARER authentication is used. +sasl-oauth-scopes = ["producer.kafka", "consumer.kafka"] +# The grant-type in the Kafka SASL OAUTHBEARER authentication. The default value is "client_credentials". This parameter is optional when the OAUTHBEARER authentication is used. +sasl-oauth-grant-type = "client_credentials" +# The audience in the Kafka SASL OAUTHBEARER authentication. The default value is empty. This parameter is optional when the OAUTHBEARER authentication is used. +sasl-oauth-audience = "kafka" ``` From 743190858438eded9929b1fd27279c2ef1e07813 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Tue, 4 Jul 2023 10:53:13 +0800 Subject: [PATCH 24/30] resource_control: ru consistent with serverless (#14081) --- tidb-resource-control.md | 55 ++++++++++++++++++++++++++++++---------- 1 file changed, 41 insertions(+), 14 deletions(-) diff --git a/tidb-resource-control.md b/tidb-resource-control.md index 9fee7b5389199..1aa9d067bfe8e 100644 --- a/tidb-resource-control.md +++ b/tidb-resource-control.md @@ -43,21 +43,48 @@ Currently, the resource control feature has the following limitations: ## What is Request Unit (RU) -Request Unit (RU) is a unified abstraction unit in TiDB for system resources, which currently includes CPU, IOPS, and IO bandwidth metrics. 
The consumption of these three metrics is represented by RU according to a certain ratio. +Request Unit (RU) is a unified abstraction unit in TiDB for system resources, which currently includes CPU, IOPS, and IO bandwidth metrics. It is used to indicate the amount of resources consumed by a single request to the database. The number of RUs consumed by a request depends on a variety of factors, such as the type of operations, and the amount of data being queried or modified. Currently, the RU contains consumption statistics for the resources in the following table: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
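+<table>
+    <thead>
+        <tr><th>Resource type</th><th>RU consumption</th></tr>
+    </thead>
+    <tbody>
+        <tr><td rowspan="3">Read</td><td>2 storage read batches consume 1 RU</td></tr>
+        <tr><td>8 storage read requests consume 1 RU</td></tr>
+        <tr><td>64 KiB read request payload consumes 1 RU</td></tr>
+        <tr><td rowspan="3">Write</td><td>1 storage write batch consumes 1 RU for each replica</td></tr>
+        <tr><td>1 storage write request consumes 1 RU</td></tr>
+        <tr><td>1 KiB write request payload consumes 1 RU</td></tr>
+        <tr><td>SQL CPU</td><td>3 ms consumes 1 RU</td></tr>
+    </tbody>
+</table>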
-The following table shows the consumption of TiKV storage layer CPU and IO resources by user requests and the corresponding RU weights. - -| Resource | RU Weight | -|:----------------|:-----------------| -| CPU | 1/3 RU per millisecond | -| Read IO | 1/64 RU per KB | -| Write IO | 1 RU/KB | -| Basic overhead of a read request | 0.25 RU | -| Basic overhead of a write request | 1.5 RU | - -Based on the above table, assuming that the TiKV time consumed by a resource group is `c` milliseconds, `r1` times of requests read `r2` KB data, `w1` times of write requests write `w2` KB data, and the number of non-witness TiKV nodes in the cluster is `n`. Then, the formula for the total RUs consumed by the resource group is as follows: - -`c`\* 1/3 + (`r1` \* 0.25 + `r2` \* 1/64) + (1.5 \* `w1` + `w2` \* 1 \* `n`) +> **Note:** +> +> - Each write operation is eventually replicated to all replicas (by default, TiKV has 3 replicas). Each replication operation is considered a different write operation. +> - In addition to queries executed by users, RU can be consumed by background tasks, such as automatic statistics collection. +> - The preceding table lists only the resources involved in RU calculation for TiDB Self-Hosted clusters, excluding the network and storage consumption. For TiDB Serverless RUs, see [TiDB Serverless Pricing Details](https://www.pingcap.com/tidb-cloud-serverless-pricing-details/). 
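Editorial note (not part of the patch): the ratios in the new RU table can be turned into a rough back-of-the-envelope estimate. The sketch below is only an illustration under stated assumptions. The function `estimate_ru` and all of its parameter names are hypothetical and are not a TiDB API. It assumes that the per-resource costs simply add up, and it applies the replica multiplier only to write batches, because that is the only row where the table mentions replicas. The patch does not say whether write requests and write payload also scale with the replica count.

```python
# Hypothetical helper: NOT a TiDB API. It only restates the ratios from the
# RU table in this patch and assumes that the individual costs are additive.
def estimate_ru(read_batches=0, read_requests=0, read_kib=0.0,
                write_batches=0, write_requests=0, write_kib=0.0,
                sql_cpu_ms=0.0, replicas=3):
    """Return a rough RU estimate for one workload sample."""
    # Read: 2 batches = 1 RU, 8 requests = 1 RU, 64 KiB payload = 1 RU.
    read_ru = read_batches / 2 + read_requests / 8 + read_kib / 64
    # Write: 1 batch = 1 RU per replica (assumption: only this term is
    # multiplied by the replica count), 1 request = 1 RU, 1 KiB = 1 RU.
    write_ru = write_batches * replicas + write_requests + write_kib
    # SQL CPU: 3 ms = 1 RU.
    cpu_ru = sql_cpu_ms / 3
    return read_ru + write_ru + cpu_ru


if __name__ == "__main__":
    # One RU from each read ratio plus one RU of SQL CPU.
    print(estimate_ru(read_batches=2, read_requests=8, read_kib=64, sql_cpu_ms=3))
    # One write batch on 3 replicas, one write request, 1 KiB of payload.
    print(estimate_ru(write_batches=1, write_requests=1, write_kib=1))
```

Treat the result as an order-of-magnitude figure only. As the note above states, background tasks also consume RUs, and network and storage consumption are excluded from the table.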
## Parameters for resource control From 92ba570a97eb7d3546fcc4efe4bb9f58ae16ad92 Mon Sep 17 00:00:00 2001 From: Aolin Date: Tue, 4 Jul 2023 13:38:14 +0800 Subject: [PATCH 25/30] br: fix the backup code example (#14102) --- br/backup-and-restore-storages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/br/backup-and-restore-storages.md b/br/backup-and-restore-storages.md index aedff53e000e0..938db1bf4565d 100644 --- a/br/backup-and-restore-storages.md +++ b/br/backup-and-restore-storages.md @@ -94,7 +94,7 @@ This section provides some URI examples by using `external` as the `host` parame **Back up snapshot data to Amazon S3** ```shell -./br restore full -u "${PD_IP}:2379" \ +./br backup full -u "${PD_IP}:2379" \ --storage "s3://external/backup-20220915?access-key=${access-key}&secret-access-key=${secret-access-key}" ``` From 347d8ce7aa8141fb1b3a3ce137d07ac5591eabf3 Mon Sep 17 00:00:00 2001 From: Roger Song Date: Tue, 4 Jul 2023 19:08:14 +0800 Subject: [PATCH 26/30] include warning in tidb control (#14041) --- tidb-control.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tidb-control.md b/tidb-control.md index 6447ffe67eb4c..921a27a761ded 100644 --- a/tidb-control.md +++ b/tidb-control.md @@ -8,6 +8,10 @@ aliases: ['/docs/dev/tidb-control/','/docs/dev/reference/tools/tidb-control/'] TiDB Control is a command-line tool of TiDB, usually used to obtain the status information of TiDB for debugging. This document introduces the features of TiDB Control and how to use these features. +> **Note:** +> +> TiDB Control is specifically designed for debugging purposes and might not be fully compatible with future capabilities introduced in TiDB. It's not recommended to include this tool in applications or utilities development to get information. + ## Get TiDB Control You can get TiDB Control by installing it using TiUP or by compiling it from source code. 
From 7f0cb75993a33896a6cbd83844d0d8837c427bf9 Mon Sep 17 00:00:00 2001 From: Ran Date: Wed, 5 Jul 2023 09:42:44 +0800 Subject: [PATCH 27/30] *: remove serverless beta from urls (#14095) --- develop/dev-guide-aws-appflow-integration.md | 2 +- develop/dev-guide-build-cluster-in-cloud.md | 4 ++-- .../information-schema-resource-groups.md | 2 +- information-schema/information-schema-slow-query.md | 4 ++-- releases/release-7.0.0.md | 2 +- sql-statements/sql-statement-alter-resource-group.md | 2 +- sql-statements/sql-statement-calibrate-resource.md | 4 ++-- .../sql-statement-create-resource-group.md | 2 +- sql-statements/sql-statement-drop-resource-group.md | 2 +- .../sql-statement-flashback-to-timestamp.md | 2 +- sql-statements/sql-statement-load-data.md | 2 +- sql-statements/sql-statement-set-resource-group.md | 2 +- .../sql-statement-show-create-resource-group.md | 2 +- statement-summary-tables.md | 4 ++-- system-variables.md | 2 +- tidb-cloud/changefeed-sink-to-mysql.md | 12 ++++++------ tidb-cloud/tune-performance.md | 4 ++-- tidb-resource-control.md | 4 ++-- time-to-live.md | 2 +- 19 files changed, 30 insertions(+), 30 deletions(-) diff --git a/develop/dev-guide-aws-appflow-integration.md b/develop/dev-guide-aws-appflow-integration.md index 1b2ff2a604b4c..1305234a01a60 100644 --- a/develop/dev-guide-aws-appflow-integration.md +++ b/develop/dev-guide-aws-appflow-integration.md @@ -244,5 +244,5 @@ test> SELECT * FROM sf_account; - If anything goes wrong, you can navigate to the [CloudWatch](https://console.aws.amazon.com/cloudwatch/home) page on the AWS Management Console to get logs. - The steps in this document are based on [Building custom connectors using the Amazon AppFlow Custom Connector SDK](https://aws.amazon.com/blogs/compute/building-custom-connectors-using-the-amazon-appflow-custom-connector-sdk/). -- [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta) is **NOT** a production environment. 
+- [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless) is **NOT** a production environment. - To prevent excessive length, the examples in this document only show the `Insert` strategy, but `Update` and `Upsert` strategies are also tested and can be used. \ No newline at end of file diff --git a/develop/dev-guide-build-cluster-in-cloud.md b/develop/dev-guide-build-cluster-in-cloud.md index 3c5d7425c52dd..ec13e6d894b0e 100644 --- a/develop/dev-guide-build-cluster-in-cloud.md +++ b/develop/dev-guide-build-cluster-in-cloud.md @@ -45,7 +45,7 @@ This document walks you through the quickest way to get started with TiDB Cloud. > **Note:** > -> For [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta), when you connect to your cluster, you must include the prefix for your cluster in the user name and wrap the name with quotation marks. For more information, see [User name prefix](https://docs.pingcap.com/tidbcloud/select-cluster-tier#user-name-prefix). +> For [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless), when you connect to your cluster, you must include the prefix for your cluster in the user name and wrap the name with quotation marks. For more information, see [User name prefix](https://docs.pingcap.com/tidbcloud/select-cluster-tier#user-name-prefix). @@ -53,7 +53,7 @@ This document walks you through the quickest way to get started with TiDB Cloud. > **Note:** > -> For [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta), when you connect to your cluster, you must include the prefix for your cluster in the user name and wrap the name with quotation marks. For more information, see [User name prefix](/tidb-cloud/select-cluster-tier.md#user-name-prefix). 
+> For [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless), when you connect to your cluster, you must include the prefix for your cluster in the user name and wrap the name with quotation marks. For more information, see [User name prefix](/tidb-cloud/select-cluster-tier.md#user-name-prefix). diff --git a/information-schema/information-schema-resource-groups.md b/information-schema/information-schema-resource-groups.md index ca4ce6472a48e..5b6d025370f44 100644 --- a/information-schema/information-schema-resource-groups.md +++ b/information-schema/information-schema-resource-groups.md @@ -9,7 +9,7 @@ summary: Learn the `RESOURCE_GROUPS` information_schema table. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). diff --git a/information-schema/information-schema-slow-query.md b/information-schema/information-schema-slow-query.md index 49f4ec8607c0a..67fb7d99e37e2 100644 --- a/information-schema/information-schema-slow-query.md +++ b/information-schema/information-schema-slow-query.md @@ -11,8 +11,8 @@ The `SLOW_QUERY` table provides the slow query information of the current node, > **Note:** > -> The `SLOW_QUERY` table is unavailable for [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). - +> The `SLOW_QUERY` table is unavailable for [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). 
+ diff --git a/releases/release-7.0.0.md b/releases/release-7.0.0.md index 633de81032515..37f5d91f399f6 100644 --- a/releases/release-7.0.0.md +++ b/releases/release-7.0.0.md @@ -248,7 +248,7 @@ In v7.0.0-DMR, the key new features and improvements are as follows: * [DBeaver](https://dbeaver.io/) v23.0.1 supports TiDB by default [#17396](https://github.com/dbeaver/dbeaver/issues/17396) @[Icemap](https://github.com/Icemap) - Provides an independent TiDB module, icon, and logo. - - The default configuration supports [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta), making it easier to connect to TiDB Serverless. + - The default configuration supports [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless), making it easier to connect to TiDB Serverless. - Supports identifying TiDB versions to display or hide foreign key tabs. - Supports visualizing SQL execution plans in `EXPLAIN` results. - Supports highlighting TiDB keywords such as `PESSIMISTIC`, `OPTIMISTIC`, `AUTO_RANDOM`, `PLACEMENT`, `POLICY`, `REORGANIZE`, `EXCHANGE`, `CACHE`, `NONCLUSTERED`, and `CLUSTERED`. diff --git a/sql-statements/sql-statement-alter-resource-group.md b/sql-statements/sql-statement-alter-resource-group.md index 4beaf53b8d4f3..95164f877caf6 100644 --- a/sql-statements/sql-statement-alter-resource-group.md +++ b/sql-statements/sql-statement-alter-resource-group.md @@ -9,7 +9,7 @@ summary: Learn the usage of ALTER RESOURCE GROUP in TiDB. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). 
diff --git a/sql-statements/sql-statement-calibrate-resource.md b/sql-statements/sql-statement-calibrate-resource.md index 8edf202d22aee..9f4bb05e44e87 100644 --- a/sql-statements/sql-statement-calibrate-resource.md +++ b/sql-statements/sql-statement-calibrate-resource.md @@ -11,7 +11,7 @@ The `CALIBRATE RESOURCE` statement is used to estimate and output the ['Request > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). @@ -34,7 +34,7 @@ To execute this command, make sure that the following requirements are met: - The user has `SUPER` or `RESOURCE_GROUP_ADMIN` privilege. - The user has the `SELECT` privilege for all tables in the `METRICS_SCHEMA` schema. -## Methods for estimating capacity +## Methods for estimating capacity TiDB provides two methods for estimation: diff --git a/sql-statements/sql-statement-create-resource-group.md b/sql-statements/sql-statement-create-resource-group.md index 6d80609a13ba9..eabd2588d8e04 100644 --- a/sql-statements/sql-statement-create-resource-group.md +++ b/sql-statements/sql-statement-create-resource-group.md @@ -9,7 +9,7 @@ summary: Learn the usage of CREATE RESOURCE GROUP in TiDB. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). 
diff --git a/sql-statements/sql-statement-drop-resource-group.md b/sql-statements/sql-statement-drop-resource-group.md index f8c13bd761776..1fd1583452e60 100644 --- a/sql-statements/sql-statement-drop-resource-group.md +++ b/sql-statements/sql-statement-drop-resource-group.md @@ -9,7 +9,7 @@ summary: Learn the usage of DROP RESOURCE GROUP in TiDB. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). diff --git a/sql-statements/sql-statement-flashback-to-timestamp.md b/sql-statements/sql-statement-flashback-to-timestamp.md index d6137bffe1c91..8165316870a35 100644 --- a/sql-statements/sql-statement-flashback-to-timestamp.md +++ b/sql-statements/sql-statement-flashback-to-timestamp.md @@ -11,7 +11,7 @@ TiDB v6.4.0 introduces the `FLASHBACK CLUSTER TO TIMESTAMP` syntax. You can use > **Warning:** > -> The `FLASHBACK CLUSTER TO TIMESTAMP` syntax is not applicable to [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta) clusters. Do not execute this statement on TiDB Serverless clusters to avoid unexpected results. +> The `FLASHBACK CLUSTER TO TIMESTAMP` syntax is not applicable to [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless) clusters. Do not execute this statement on TiDB Serverless clusters to avoid unexpected results. 
diff --git a/sql-statements/sql-statement-load-data.md b/sql-statements/sql-statement-load-data.md index 901ae7779032f..efe02fd5ea61e 100644 --- a/sql-statements/sql-statement-load-data.md +++ b/sql-statements/sql-statement-load-data.md @@ -21,7 +21,7 @@ In TiDB v7.0.0, the `LOAD DATA` SQL statement supports the following features: > **Note:** > -> This feature is only available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is only available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). diff --git a/sql-statements/sql-statement-set-resource-group.md b/sql-statements/sql-statement-set-resource-group.md index 03cb62a2fc682..fb46594680b8f 100644 --- a/sql-statements/sql-statement-set-resource-group.md +++ b/sql-statements/sql-statement-set-resource-group.md @@ -11,7 +11,7 @@ summary: An overview of the usage of SET RESOURCE GROUP in the TiDB database. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). diff --git a/sql-statements/sql-statement-show-create-resource-group.md b/sql-statements/sql-statement-show-create-resource-group.md index eb2d2e61284ea..2a982874ae28a 100644 --- a/sql-statements/sql-statement-show-create-resource-group.md +++ b/sql-statements/sql-statement-show-create-resource-group.md @@ -9,7 +9,7 @@ summary: Learn the usage of SHOW CREATE RESOURCE GROUP in TiDB. > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). 
diff --git a/statement-summary-tables.md b/statement-summary-tables.md index fbf98606f1c18..fb75efdafa18c 100644 --- a/statement-summary-tables.md +++ b/statement-summary-tables.md @@ -20,8 +20,8 @@ Therefore, starting from v4.0.0-rc.1, TiDB provides system tables in `informatio > **Note:** > -> The following tables are unavailable for [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta): `statements_summary`, `statements_summary_history`, `cluster_statements_summary`, and `cluster_statements_summary_history`. - +> The following tables are unavailable for [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless): `statements_summary`, `statements_summary_history`, `cluster_statements_summary`, and `cluster_statements_summary_history`. + This document details these tables and introduces how to use them to troubleshoot SQL performance issues. diff --git a/system-variables.md b/system-variables.md index b489f9bd7169d..c66dac7963937 100644 --- a/system-variables.md +++ b/system-variables.md @@ -1364,7 +1364,7 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; > **Note:** > -> To improve the speed for index creation using this variable, make sure that your TiDB cluster is hosted on AWS and your TiDB node size is at least 8 vCPU. For [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta) clusters, this feature is unavailable. +> To improve the speed for index creation using this variable, make sure that your TiDB cluster is hosted on AWS and your TiDB node size is at least 8 vCPU. For [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless) clusters, this feature is unavailable. 
diff --git a/tidb-cloud/changefeed-sink-to-mysql.md b/tidb-cloud/changefeed-sink-to-mysql.md index 3afc0ffdcec7a..e8f7fa872bd2d 100644 --- a/tidb-cloud/changefeed-sink-to-mysql.md +++ b/tidb-cloud/changefeed-sink-to-mysql.md @@ -24,11 +24,11 @@ Make sure that your TiDB Cluster can connect to the MySQL service. If your MySQL service is in an AWS VPC that has no public internet access, take the following steps: 1. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the MySQL service and your TiDB cluster. -2. Modify the inbound rules of the security group that the MySQL service is associated with. +2. Modify the inbound rules of the security group that the MySQL service is associated with. You must add [the CIDR of the region where your TiDB Cloud cluster is located](/tidb-cloud/set-up-vpc-peering-connections.md#prerequisite-set-a-project-cidr) to the inbound rules. Doing so allows the traffic to flow from your TiDB Cluster to the MySQL instance. -3. If the MySQL URL contains a hostname, you need to allow TiDB Cloud to be able to resolve the DNS hostname of the MySQL service. +3. If the MySQL URL contains a hostname, you need to allow TiDB Cloud to be able to resolve the DNS hostname of the MySQL service. 1. Follow the steps in [Enable DNS resolution for a VPC peering connection](https://docs.aws.amazon.com/vpc/latest/peering/modify-peering-connections.html#vpc-peering-dns). 2. Enable the **Accepter DNS resolution** option. @@ -36,10 +36,10 @@ If your MySQL service is in an AWS VPC that has no public internet access, take If your MySQL service is in a GCP VPC that has no public internet access, take the following steps: 1. If your MySQL service is Google Cloud SQL, you must expose a MySQL endpoint in the associated VPC of the Google Cloud SQL instance. You may need to use the [**Cloud SQL Auth proxy**](https://cloud.google.com/sql/docs/mysql/sql-proxy) which is developed by Google. -2. 
[Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the MySQL service and your TiDB cluster. +2. [Set up a VPC peering connection](/tidb-cloud/set-up-vpc-peering-connections.md) between the VPC of the MySQL service and your TiDB cluster. 3. Modify the ingress firewall rules of the VPC where MySQL is located. - You must add [the CIDR of the region where your TiDB Cloud cluster is located](/tidb-cloud/set-up-vpc-peering-connections.md#prerequisite-set-a-project-cidr) to the ingress firewall rules. Doing so allows the traffic to flow from your TiDB Cluster to the MySQL endpoint. + You must add [the CIDR of the region where your TiDB Cloud cluster is located](/tidb-cloud/set-up-vpc-peering-connections.md#prerequisite-set-a-project-cidr) to the ingress firewall rules. Doing so allows the traffic to flow from your TiDB Cluster to the MySQL endpoint. ### Full load data @@ -70,7 +70,7 @@ The **Sink to MySQL** connector can only sink incremental data from your TiDB cl Log: tidb-binlog Pos: 420747102018863124 Finished dump at: 2020-11-10 10:40:20 - ``` + ``` ## Create a MySQL sink @@ -102,7 +102,7 @@ After completing the prerequisites, you can sink your data to MySQL. 7. Click **Next** to review the Changefeed configuration. If you confirm all configurations are correct, check the compliance of cross-region replication, and click **Create**. - + If you want to modify some configurations, click **Previous** to go back to the previous configuration page. 8. The sink starts soon, and you can see the status of the sink changes from "**Creating**" to "**Running**". 
diff --git a/tidb-cloud/tune-performance.md b/tidb-cloud/tune-performance.md index 425200239f58e..7861e2805e912 100644 --- a/tidb-cloud/tune-performance.md +++ b/tidb-cloud/tune-performance.md @@ -16,7 +16,7 @@ TiDB Cloud provides [Statement Analysis](#statement-analysis), [Slow Query](#slo > **Note:** > > Currently, these three features are unavailable for [Serverless Tier clusters](/tidb-cloud/select-cluster-tier.md#serverless-tier-beta). - + ## Statement Analysis To use the statement analysis, perform the following steps: @@ -37,7 +37,7 @@ For more information, see [Statement Execution Details in TiDB Dashboard](https: ## Slow Query -By default, SQL queries that take more than 300 milliseconds are considered as slow queries. +By default, SQL queries that take more than 300 milliseconds are considered as slow queries. To view slow queries in a cluster, perform the following steps: diff --git a/tidb-resource-control.md b/tidb-resource-control.md index 1aa9d067bfe8e..6ba8182632842 100644 --- a/tidb-resource-control.md +++ b/tidb-resource-control.md @@ -9,7 +9,7 @@ summary: Learn how to use the resource control feature to control and schedule a > **Note:** > -> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +> This feature is not available on [TiDB Serverless clusters](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). @@ -108,7 +108,7 @@ Starting from TiDB v7.0.0, both parameters are enabled by default. 
The results o | `resource-control.enabled` | `tidb_enable_resource_control`= ON | `tidb_enable_resource_control`= OFF | |:----------------------------|:-------------------------------------|:-------------------------------------| -| `resource-control.enabled`= true | Flow control and scheduling (recommended) | Invalid combination | +| `resource-control.enabled`= true | Flow control and scheduling (recommended) | Invalid combination | | `resource-control.enabled`= false | Only flow control (not recommended) | The feature is disabled. | For more information about the resource control mechanism and parameters, see [RFC: Global Resource Control in TiDB](https://github.com/pingcap/tidb/blob/master/docs/design/2022-11-25-global-resource-control.md). diff --git a/time-to-live.md b/time-to-live.md index 2451c906c1064..6754e999d19cd 100644 --- a/time-to-live.md +++ b/time-to-live.md @@ -252,7 +252,7 @@ Currently, the TTL feature has the following limitations: * A table with the TTL attribute does not support being referenced by other tables as the primary table in a foreign key constraint. * It is not guaranteed that all expired data is deleted immediately. The time when expired data is deleted depends on the scheduling interval and scheduling window of the background cleanup job. * For tables that use [clustered indexes](/clustered-indexes.md), if the primary key is neither an integer nor a binary string type, the TTL job cannot be split into multiple tasks. This will cause the TTL job to be executed sequentially on a single TiDB node. If the table contains a large amount of data, the execution of the TTL job might become slow. -* TTL is not available for [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta). +* TTL is not available for [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless). 
## FAQs From 3b83a13f784ae96a3e26f6f8411ae881f6de049a Mon Sep 17 00:00:00 2001 From: Cheese Date: Wed, 5 Jul 2023 02:51:44 +0000 Subject: [PATCH 28/30] feat: change django support level to full cause of we added test cases to pipeline already (#14116) --- develop/dev-guide-third-party-support.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/develop/dev-guide-third-party-support.md b/develop/dev-guide-third-party-support.md index 34fe3121ef155..d49711e775091 100644 --- a/develop/dev-guide-third-party-support.md +++ b/develop/dev-guide-third-party-support.md @@ -251,8 +251,8 @@ If you encounter problems when connecting to TiDB using the tools listed in this Python Django - v4.0.5 - Compatible + v4.1 + Full django-tidb N/A From f655eaba231d502ad6bdb31182e9d691a42845b1 Mon Sep 17 00:00:00 2001 From: Aolin Date: Wed, 5 Jul 2023 11:35:14 +0800 Subject: [PATCH 29/30] ticdc: add Golang examples for checksum verification (#14090) --- TOC.md | 4 +- ticdc/ticdc-avro-checksum-verification.md | 514 ++++++++++++++++++++++ ticdc/ticdc-integrity-check.md | 14 +- 3 files changed, 524 insertions(+), 8 deletions(-) create mode 100644 ticdc/ticdc-avro-checksum-verification.md diff --git a/TOC.md b/TOC.md index 845109f7fdeba..bd5c44d7e53ca 100644 --- a/TOC.md +++ b/TOC.md @@ -553,7 +553,9 @@ - [TiCDC CSV Protocol](/ticdc/ticdc-csv.md) - [TiCDC Open API v2](/ticdc/ticdc-open-api-v2.md) - [TiCDC Open API v1](/ticdc/ticdc-open-api.md) - - [Guide for Developing a Storage Sink Consumer](/ticdc/ticdc-storage-consumer-dev-guide.md) + - TiCDC Data Consumption + - [TiCDC Row Data Checksum Verification Based on Avro](/ticdc/ticdc-avro-checksum-verification.md) + - [Guide for Developing a Storage Sink Consumer](/ticdc/ticdc-storage-consumer-dev-guide.md) - [Compatibility](/ticdc/ticdc-compatibility.md) - [Troubleshoot](/ticdc/troubleshoot-ticdc.md) - [FAQs](/ticdc/ticdc-faq.md) diff --git a/ticdc/ticdc-avro-checksum-verification.md 
b/ticdc/ticdc-avro-checksum-verification.md new file mode 100644 index 0000000000000..1cc9e89a290c8 --- /dev/null +++ b/ticdc/ticdc-avro-checksum-verification.md @@ -0,0 +1,514 @@ +--- +title: TiCDC Row Data Checksum Verification Based on Avro +summary: Introduce the detailed implementation of TiCDC row data checksum verification. +--- + +# TiCDC Row Data Checksum Verification Based on Avro + +This document introduces how to consume data sent to Kafka by TiCDC and encoded by Avro protocol using Golang, and how to perform data verification using the [Single-row data checksum feature](/ticdc/ticdc-integrity-check.md). + +The source code of this example is available in the [`avro-checksum-verification`](https://github.com/pingcap/tiflow/tree/master/examples/golang/avro-checksum-verification) directory. + +The example in this document uses [kafka-go](https://github.com/segmentio/kafka-go) to create a simple Kafka consumer program. This program continuously reads data from a specified topic, calculates the checksum, and verifies its value. + +```go +package main + +import ( + "context" + "encoding/binary" + "encoding/json" + "hash/crc32" + "io" + "math" + "net/http" + "strconv" + "strings" + + "github.com/linkedin/goavro/v2" + "github.com/pingcap/log" + "github.com/pingcap/tidb/parser/mysql" + "github.com/pingcap/tidb/types" + "github.com/pingcap/tiflow/pkg/errors" + "github.com/segmentio/kafka-go" + "go.uber.org/zap" +) + +const ( + // The first byte of the Confluent Avro wire format is always 0. + // For more details, see https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format. 
+ magicByte = uint8(0) +) + +func main() { + var ( + kafkaAddr = "127.0.0.1:9092" + schemaRegistryURL = "http://127.0.0.1:8081" + + topic = "avro-checksum-test" + consumerGroupID = "avro-checksum-test" + ) + + consumer := kafka.NewReader(kafka.ReaderConfig{ + Brokers: []string{kafkaAddr}, + GroupID: consumerGroupID, + Topic: topic, + MaxBytes: 10e6, // 10MB + }) + defer consumer.Close() + + ctx := context.Background() + log.Info("start consuming ...", zap.String("kafka", kafkaAddr), zap.String("topic", topic), zap.String("groupID", consumerGroupID)) + for { + // 1. Fetch the kafka message. + message, err := consumer.FetchMessage(ctx) + if err != nil { + log.Error("read kafka message failed", zap.Error(err)) + // The message is not valid when the fetch fails, so stop consuming. + break + } + + value := message.Value + if len(value) == 0 { + log.Info("delete event does not have value, skip checksum verification", zap.String("topic", topic)) + // There is nothing to decode or verify for this message. + continue + } + + // 2. Decode the value to get the corresponding value map and schema map. + valueMap, valueSchema, err := getValueMapAndSchema(value, schemaRegistryURL) + if err != nil { + log.Panic("decode kafka value failed", zap.String("topic", topic), zap.ByteString("value", value), zap.Error(err)) + } + + // 3. Calculate and verify checksum value using the value map and schema map obtained in the previous step. + err = CalculateAndVerifyChecksum(valueMap, valueSchema) + if err != nil { + log.Panic("calculate checksum failed", zap.String("topic", topic), zap.ByteString("value", value), zap.Error(err)) + } + + // 4. Commit offset after the data is successfully consumed. + if err := consumer.CommitMessages(ctx, message); err != nil { + log.Error("commit kafka message failed", zap.Error(err)) + break + } + } +} +``` + +The key steps for calculating the checksum value are `getValueMapAndSchema()` and `CalculateAndVerifyChecksum()`. The following sections describe the implementation of these two functions.
+ +## Decode data and get the corresponding schema + +The `getValueMapAndSchema()` function decodes data and gets the corresponding schema. This function returns both the data and schema as a `map[string]interface{}` type. + +```go +// data is the key or value of the received kafka message, and url is the schema registry url. +// This function returns the decoded value and corresponding schema as map. +func getValueMapAndSchema(data []byte, url string) (map[string]interface{}, map[string]interface{}, error) { + schemaID, payload, err := extractSchemaIDAndBinaryData(data) + if err != nil { + return nil, nil, err + } + + codec, err := GetSchema(url, schemaID) + if err != nil { + return nil, nil, err + } + + native, _, err := codec.NativeFromBinary(payload) + if err != nil { + return nil, nil, err + } + + result, ok := native.(map[string]interface{}) + if !ok { + return nil, nil, errors.New("raw avro message is not a map") + } + + schema := make(map[string]interface{}) + if err := json.Unmarshal([]byte(codec.Schema()), &schema); err != nil { + return nil, nil, errors.Trace(err) + } + + return result, schema, nil +} + +// extractSchemaIDAndBinaryData splits a Confluent wire-format message into the schema ID (bytes 1 to 4, big-endian) and the Avro binary payload (the remaining bytes). +func extractSchemaIDAndBinaryData(data []byte) (int, []byte, error) { + if len(data) < 5 { + return 0, nil, errors.ErrAvroInvalidMessage.FastGenByArgs() + } + if data[0] != magicByte { + return 0, nil, errors.ErrAvroInvalidMessage.FastGenByArgs() + } + return int(binary.BigEndian.Uint32(data[1:5])), data[5:], nil +} + +// GetSchema fetches the schema from the schema registry by the schema ID. +// This function returns a goavro.Codec that can be used to encode and decode the data.
+func GetSchema(url string, schemaID int) (*goavro.Codec, error) { + requestURI := url + "/schemas/ids/" + strconv.Itoa(schemaID) + + req, err := http.NewRequest("GET", requestURI, nil) + if err != nil { + log.Error("Cannot create the request to look up the schema", zap.Error(err)) + return nil, errors.WrapError(errors.ErrAvroSchemaAPIError, err) + } + req.Header.Add( + "Accept", + "application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, "+ + "application/json", + ) + + httpClient := &http.Client{} + resp, err := httpClient.Do(req) + if err != nil { + return nil, err + } + defer resp.Body.Close() + + body, err := io.ReadAll(resp.Body) + if err != nil { + log.Error("Cannot parse the lookup schema response", zap.Error(err)) + return nil, errors.WrapError(errors.ErrAvroSchemaAPIError, err) + } + + if resp.StatusCode == 404 { + log.Warn("Specified schema not found in Registry", zap.String("requestURI", requestURI), zap.Int("schemaID", schemaID)) + return nil, errors.ErrAvroSchemaAPIError.GenWithStackByArgs("Schema not found in Registry") + } + + if resp.StatusCode != 200 { + log.Error("Failed to query schema from the Registry, HTTP error", + zap.Int("status", resp.StatusCode), zap.String("uri", requestURI), zap.ByteString("responseBody", body)) + return nil, errors.ErrAvroSchemaAPIError.GenWithStack("Failed to query schema from the Registry, HTTP error") + } + + var jsonResp lookupResponse + err = json.Unmarshal(body, &jsonResp) + if err != nil { + log.Error("Failed to parse result from Registry", zap.Error(err)) + return nil, errors.WrapError(errors.ErrAvroSchemaAPIError, err) + } + + codec, err := goavro.NewCodec(jsonResp.Schema) + if err != nil { + return nil, errors.WrapError(errors.ErrAvroSchemaAPIError, err) + } + return codec, nil +} + +type lookupResponse struct { + Name string `json:"name"` + SchemaID int `json:"id"` + Schema string `json:"schema"` +} + +``` + +## Calculate and verify the checksum value + +The `valueMap` and `valueSchema` 
obtained in the previous step contain all the elements used for checksum calculation and verification. + +The checksum calculation and verification process on the consumer side includes the following steps: + +1. Get the expected checksum value. +2. Iterate over each column, generate a byte slice according to the column value and the corresponding MySQL type, and update the checksum value continuously. +3. Compare the checksum value calculated in the previous step with the checksum value obtained from the received message. If they are not the same, the checksum verification fails and the data might be corrupted. + +The sample code is as follows: + +```go +func CalculateAndVerifyChecksum(valueMap, valueSchema map[string]interface{}) error { + // The fields variable stores the column type information for each data change event. The fields are sorted by column ID, which is the same order in which the checksum is calculated. + fields, ok := valueSchema["fields"].([]interface{}) + if !ok { + return errors.New("schema fields should be a list") + } + + // 1. Get the expected checksum value from valueMap, which is encoded as a string. + // If the expected checksum value is not found, it means that the checksum feature is not enabled when TiCDC sends the data. In this case, this function returns directly. + o, ok := valueMap["_tidb_row_level_checksum"] + if !ok { + return nil + } + expected := o.(string) + if expected == "" { + return nil + } + + // expectedChecksum is the expected checksum value passed from TiCDC. + expectedChecksum, err := strconv.ParseUint(expected, 10, 64) + if err != nil { + return errors.Trace(err) + } + + // 2. Iterate over each field and calculate the checksum value. + var actualChecksum uint32 + // buf stores the byte slice used to update the checksum value each time.
+ buf := make([]byte, 0) + for _, item := range fields { + field, ok := item.(map[string]interface{}) + if !ok { + return errors.New("schema field should be a map") + } + + // The tidbOp and subsequent columns are not involved in the checksum calculation, because they are used to assist data consumption and not real TiDB column data. + colName := field["name"].(string) + if colName == "_tidb_op" { + break + } + + // The holder variable stores the type information of each column. + var holder map[string]interface{} + switch ty := field["type"].(type) { + case []interface{}: + for _, item := range ty { + if m, ok := item.(map[string]interface{}); ok { + holder = m["connect.parameters"].(map[string]interface{}) + break + } + } + case map[string]interface{}: + holder = ty["connect.parameters"].(map[string]interface{}) + default: + log.Panic("type info is anything else", zap.Any("typeInfo", field["type"])) + } + tidbType := holder["tidb_type"].(string) + + mysqlType := mysqlTypeFromTiDBType(tidbType) + + // Get the value of each column from the decoded value map according to the name of each column. + value, ok := valueMap[colName] + if !ok { + return errors.New("value not found") + } + value, err := getColumnValue(value, holder, mysqlType) + if err != nil { + return errors.Trace(err) + } + + if len(buf) > 0 { + buf = buf[:0] + } + + // Generate a byte slice used to update the checksum according to the value and mysqlType of each column, and then update the checksum value. 
+ buf, err = buildChecksumBytes(buf, value, mysqlType) + if err != nil { + return errors.Trace(err) + } + actualChecksum = crc32.Update(actualChecksum, crc32.IEEETable, buf) + } + + if uint64(actualChecksum) != expectedChecksum { + log.Error("checksum mismatch", + zap.Uint64("expected", expectedChecksum), + zap.Uint64("actual", uint64(actualChecksum))) + return errors.New("checksum mismatch") + } + + log.Info("checksum verified", zap.Uint64("checksum", uint64(actualChecksum))) + return nil +} + +func mysqlTypeFromTiDBType(tidbType string) byte { + var result byte + switch tidbType { + case "INT", "INT UNSIGNED": + result = mysql.TypeLong + case "BIGINT", "BIGINT UNSIGNED": + result = mysql.TypeLonglong + case "FLOAT": + result = mysql.TypeFloat + case "DOUBLE": + result = mysql.TypeDouble + case "BIT": + result = mysql.TypeBit + case "DECIMAL": + result = mysql.TypeNewDecimal + case "TEXT": + result = mysql.TypeVarchar + case "BLOB": + result = mysql.TypeLongBlob + case "ENUM": + result = mysql.TypeEnum + case "SET": + result = mysql.TypeSet + case "JSON": + result = mysql.TypeJSON + case "DATE": + result = mysql.TypeDate + case "DATETIME": + result = mysql.TypeDatetime + case "TIMESTAMP": + result = mysql.TypeTimestamp + case "TIME": + result = mysql.TypeDuration + case "YEAR": + result = mysql.TypeYear + default: + log.Panic("this should not happen, unknown TiDB type", zap.String("type", tidbType)) + } + return result +} + +// The value is an interface type, which needs to be converted according to the type information provided by the holder. +func getColumnValue(value interface{}, holder map[string]interface{}, mysqlType byte) (interface{}, error) { + switch t := value.(type) { + // The column with nullable is encoded as a map, and there is only one key-value pair. The key is the type, and the value is the real value. Only the real value is concerned here. 
+ case map[string]interface{}: + for _, v := range t { + value = v + } + } + + switch mysqlType { + case mysql.TypeEnum: + // Enum is encoded as a string, which is converted to the int value corresponding to the Enum definition here. + allowed := strings.Split(holder["allowed"].(string), ",") + switch t := value.(type) { + case string: + enum, err := types.ParseEnum(allowed, t, "") + if err != nil { + return nil, errors.Trace(err) + } + value = enum.Value + case nil: + value = nil + } + case mysql.TypeSet: + // Set is encoded as a string, which is converted to the int value corresponding to the Set definition here. + elems := strings.Split(holder["allowed"].(string), ",") + switch t := value.(type) { + case string: + s, err := types.ParseSet(elems, t, "") + if err != nil { + return nil, errors.Trace(err) + } + value = s.Value + case nil: + value = nil + } + } + return value, nil +} + +// buildChecksumBytes generates a byte slice used to update the checksum. See https://github.com/pingcap/tidb/blob/e3417913f58cdd5a136259b902bf177eaf3aa637/util/rowcodec/common.go#L308 +func buildChecksumBytes(buf []byte, value interface{}, mysqlType byte) ([]byte, error) { + if value == nil { + return buf, nil + } + + switch mysqlType { + // TypeTiny, TypeShort, and TypeInt24 are encoded as int32. + // TypeLong is encoded as int32 if signed, otherwise, it is encoded as int64. + // TypeLonglong is encoded as int64 if signed, otherwise, it is encoded as uint64. + // When the checksum feature is enabled, bigintUnsignedHandlingMode must be set to string, so BIGINT UNSIGNED values are encoded as strings.
+ case mysql.TypeTiny, mysql.TypeShort, mysql.TypeLong, mysql.TypeLonglong, mysql.TypeInt24, mysql.TypeYear: + switch a := value.(type) { + case int32: + buf = binary.LittleEndian.AppendUint64(buf, uint64(a)) + case uint32: + buf = binary.LittleEndian.AppendUint64(buf, uint64(a)) + case int64: + buf = binary.LittleEndian.AppendUint64(buf, uint64(a)) + case uint64: + buf = binary.LittleEndian.AppendUint64(buf, a) + case string: + v, err := strconv.ParseUint(a, 10, 64) + if err != nil { + return nil, errors.Trace(err) + } + buf = binary.LittleEndian.AppendUint64(buf, v) + default: + log.Panic("unknown golang type for the integral value", + zap.Any("value", value), zap.Any("mysqlType", mysqlType)) + } + // Encode float type as float64 and encode double type as float64. + case mysql.TypeFloat, mysql.TypeDouble: + var v float64 + switch a := value.(type) { + case float32: + v = float64(a) + case float64: + v = a + } + if math.IsInf(v, 0) || math.IsNaN(v) { + v = 0 + } + buf = binary.LittleEndian.AppendUint64(buf, math.Float64bits(v)) + // getColumnValue encodes Enum and Set to uint64 type. + case mysql.TypeEnum, mysql.TypeSet: + buf = binary.LittleEndian.AppendUint64(buf, value.(uint64)) + case mysql.TypeBit: + // Encode bit type as []byte and convert it to uint64. + v, err := binaryLiteralToInt(value.([]byte)) + if err != nil { + return nil, errors.Trace(err) + } + buf = binary.LittleEndian.AppendUint64(buf, v) + // Non-binary types are encoded as string, and binary types are encoded as []byte. 
+ case mysql.TypeVarchar, mysql.TypeVarString, mysql.TypeString, mysql.TypeTinyBlob, mysql.TypeMediumBlob, mysql.TypeLongBlob, mysql.TypeBlob: + switch a := value.(type) { + case string: + buf = appendLengthValue(buf, []byte(a)) + case []byte: + buf = appendLengthValue(buf, a) + default: + log.Panic("unknown golang type for the string value", + zap.Any("value", value), zap.Any("mysqlType", mysqlType)) + } + case mysql.TypeTimestamp, mysql.TypeDatetime, mysql.TypeDate, mysql.TypeDuration, mysql.TypeNewDate: + v := value.(string) + buf = appendLengthValue(buf, []byte(v)) + // When the checksum feature is enabled, decimalHandlingMode must be set to string. + case mysql.TypeNewDecimal: + buf = appendLengthValue(buf, []byte(value.(string))) + case mysql.TypeJSON: + buf = appendLengthValue(buf, []byte(value.(string))) + // Null and Geometry are not involved in the checksum calculation. + case mysql.TypeNull, mysql.TypeGeometry: + // do nothing + default: + return buf, errors.New("invalid type for the checksum calculation") + } + return buf, nil +} + +func appendLengthValue(buf []byte, val []byte) []byte { + buf = binary.LittleEndian.AppendUint32(buf, uint32(len(val))) + buf = append(buf, val...) 
+ return buf +} + +// Convert []byte to uint64, refer to https://github.com/pingcap/tidb/blob/e3417913f58cdd5a136259b902bf177eaf3aa637/types/binary_literal.go#L105 +func binaryLiteralToInt(bytes []byte) (uint64, error) { + bytes = trimLeadingZeroBytes(bytes) + length := len(bytes) + + if length > 8 { + log.Error("invalid bit value found", zap.ByteString("value", bytes)) + return math.MaxUint64, errors.New("invalid bit value") + } + + if length == 0 { + return 0, nil + } + + val := uint64(bytes[0]) + for i := 1; i < length; i++ { + val = (val << 8) | uint64(bytes[i]) + } + return val, nil +} + +func trimLeadingZeroBytes(bytes []byte) []byte { + if len(bytes) == 0 { + return bytes + } + pos, posMax := 0, len(bytes)-1 + for ; pos < posMax; pos++ { + if bytes[pos] != 0 { + break + } + } + return bytes[pos:] +} +``` diff --git a/ticdc/ticdc-integrity-check.md b/ticdc/ticdc-integrity-check.md index 1142d93548b4b..36a271dbc942a 100644 --- a/ticdc/ticdc-integrity-check.md +++ b/ticdc/ticdc-integrity-check.md @@ -90,15 +90,15 @@ fn checksum(columns) { * BIT, ENUM, and SET types are converted to UINT64. * BIT type is converted to UINT64 in binary format. - * ENUM and SET types are converted to their corresponding INT values in UINT64. For example, if the data value of a `SET('a','b','c')` type column is `'a,c'`, the value is encoded as `0b101`. + * ENUM and SET types are converted to their corresponding INT values in UINT64. For example, if the data value of a `SET('a','b','c')` type column is `'a,c'`, the value is encoded as `0b101`, which is `5` in decimal. - * TIMESTAMP, DATE, DURATION, DATETIME, JSON, and DECIMAL types are converted to STRING and then encoded as UTF8 bytes. - * VARBIANRY, BINARY, and BLOB types (including TINY, MEDIUM, and LONG) are directly encoded as bytes. - * VARCHAR, CHAR, and TEXT types (including TINY, MEDIUM, and LONG) are encoded as UTF8 bytes. 
+ * TIMESTAMP, DATE, DURATION, DATETIME, JSON, and DECIMAL types are first converted to STRING and then converted to bytes. + * CHAR, VARCHAR, VARSTRING, STRING, TEXT, and BLOB types (including TINY, MEDIUM, and LONG) are directly converted to bytes. * NULL and GEOMETRY types are excluded from the checksum calculation and this function returns empty bytes. +For more information about the implementation of data consumption and checksum verification using Golang, see [TiCDC row data checksum verification](/ticdc/ticdc-avro-checksum-verification.md). + > **Note:** > -> After enabling the checksum validation feature, DECIMAL and UNSIGNED BIGINT types data will be converted to string types. Therefore, in the downstream consumer code, you need to convert them back to their corresponding numerical types before calculating checksum values. - -The consumer code written in Golang implements steps such as decoding data read from Kafka, sorting by schema fields, and calculating the checksum value. For more information, see [`avro/decoder.go`](https://github.com/pingcap/tiflow/blob/master/pkg/sink/codec/avro/decoder.go). +> - After enabling the checksum validation feature, DECIMAL and UNSIGNED BIGINT types data will be converted to STRING types. Therefore, in the downstream consumer code, you need to convert them back to their corresponding numerical types before calculating checksum values. +> - The checksum verification process does not include DELETE events. This is because DELETE events only contain the handle key column, while the checksum is calculated based on all columns. 
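Reviewer sketch (not part of the patch): the encoding rules above can be illustrated with a small Go program. `setToUint64` is a hypothetical helper written only for this illustration and is not a TiCDC function; it assumes the usual MySQL convention that the first SET member maps to the lowest bit. `appendLengthValue` mirrors the helper shown in the patch, which writes a 4-byte little-endian length followed by the raw bytes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// setToUint64 is a hypothetical helper (not a TiCDC API). It maps a SET value
// such as "a,c" to a bitmask in which bit i is set when the i-th declared
// member is present in the value.
func setToUint64(members []string, value string) uint64 {
	var mask uint64
	if value == "" {
		return mask
	}
	for _, item := range strings.Split(value, ",") {
		for i, m := range members {
			if m == item {
				mask |= 1 << uint(i)
			}
		}
	}
	return mask
}

// appendLengthValue mirrors the helper in the patch: it appends the length of
// val as a 4-byte little-endian integer, then the bytes of val.
func appendLengthValue(buf []byte, val []byte) []byte {
	buf = binary.LittleEndian.AppendUint32(buf, uint32(len(val)))
	return append(buf, val...)
}

func main() {
	// SET('a','b','c') with value 'a,c' sets bits 0 and 2: 0b101.
	fmt.Println(setToUint64([]string{"a", "b", "c"}, "a,c")) // prints 5

	// A two-byte string is prefixed with its length, 2, in little-endian order.
	fmt.Println(appendLengthValue(nil, []byte("ab"))) // prints [2 0 0 0 97 98]
}
```

The length prefix keeps adjacent values from being confused with each other: without it, the byte sequences for `("ab", "c")` and `("a", "bc")` would be identical.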
From fd88afecf3627f8def9e8ac186c61e41173e2385 Mon Sep 17 00:00:00 2001 From: Aolin Date: Wed, 5 Jul 2023 14:26:14 +0800 Subject: [PATCH 30/30] lightning: add a note for physical import mode (#14108) --- migrate-large-mysql-to-tidb.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/migrate-large-mysql-to-tidb.md b/migrate-large-mysql-to-tidb.md index 972baabd29be2..cb21154d79649 100644 --- a/migrate-large-mysql-to-tidb.md +++ b/migrate-large-mysql-to-tidb.md @@ -7,10 +7,7 @@ summary: Learn how to migrate MySQL of large datasets to TiDB. When the data volume to be migrated is small, you can easily [use DM to migrate data](/migrate-small-mysql-to-tidb.md), both for full migration and incremental replication. However, because DM imports data at a slow speed (30~50 GiB/h), when the data volume is large, the migration might take a long time. "Large datasets" in this document usually mean data around one TiB or more. -This document describes how to migrate large datasets from MySQL to TiDB. The whole migration has two processes: - -1. *Full migration*. Use Dumpling and TiDB Lightning to perform the full migration. TiDB Lightning's **local backend** mode can import data at a speed of up to 500 GiB/h. -2. *Incremental replication*. After the full migration is completed, you can replicate the incremental data using DM. +This document describes how to perform the full migration using Dumpling and TiDB Lightning. TiDB Lightning [Physical Import Mode](/tidb-lightning/tidb-lightning-physical-import-mode.md) can import data at a speed of up to 500 GiB/h. Note that this speed is affected by various factors such as hardware configuration, table schema, and the number of indexes. After the full migration is completed, you can replicate the incremental data using DM. ## Prerequisites