From d68d9b3e3cfb40ba54f163a0aae45f2a912589f8 Mon Sep 17 00:00:00 2001
From: Steve Boyd
Date: Wed, 25 Sep 2024 13:19:26 +1200
Subject: [PATCH] DOC DB read-only replicas

---
 .../07_DB_read_only_replicas.md               | 21 +++++++++++++++++++
 en/08_Changelogs/5.4.0.md                     | 18 ++++++++++++++++
 2 files changed, 39 insertions(+)
 create mode 100644 en/02_Developer_Guides/08_Performance/07_DB_read_only_replicas.md

diff --git a/en/02_Developer_Guides/08_Performance/07_DB_read_only_replicas.md b/en/02_Developer_Guides/08_Performance/07_DB_read_only_replicas.md
new file mode 100644
index 00000000..23372722
--- /dev/null
+++ b/en/02_Developer_Guides/08_Performance/07_DB_read_only_replicas.md
@@ -0,0 +1,21 @@
+---
+title: DB read-only replicas
+summary: Using DB read-only replicas to improve performance
+---
+
+# DB read-only replicas
+
+Read-only replicas are additional databases that are used to offload read queries from the primary database, which can improve performance by reducing the load on the primary database.
+
+Read-only replicas are configured by adding environment variables that match the primary environment variables, suffixed with `_REPLICA_` followed by the replica number, padded with a leading zero if it's less than 10. For example, `SS_DATABASE_SERVER` becomes `SS_DATABASE_SERVER_REPLICA_01` for the first replica, or `SS_DATABASE_SERVER_REPLICA_12` for the 12th replica. Replicas must be numbered sequentially starting from `01`.
+
+Replicas cannot define different configuration values for `SS_DATABASE_CLASS`, `SS_DATABASE_NAME`, or `SS_DATABASE_CHOOSE_NAME`. These values are restricted to prevent issues that could arise from inconsistent database configurations across replicas.
+
+If one or more read-only replicas have been configured, then one of the read-only replicas will be selected from the pool of available replicas to handle queries for the rest of the request cycle.
However, the primary database will be used instead if any of the following criteria are met:
+
+- The current query includes any mutable SQL such as `INSERT` or `DELETE`. The primary database will be used for the current query, as well as any subsequent queries, including read queries, for the rest of the current request cycle. Mutable SQL is defined on [`DBConnector::isQueryMutable()`](api:SilverStripe\ORM\Connect\DBConnector::isQueryMutable()).
+- The HTTP request matched a rule defined in [`Director.rule_patterns_must_use_primary_db`](api:SilverStripe\Control\Director->rule_patterns_must_use_primary_db). By default, the URL paths `Security`, `dev`, and `admin` (if `silverstripe/admin` is installed) are covered by this.
+- A user with CMS access is logged in. This is done to ensure that logged-in users will correctly see any CMS updates on the website frontend. Users without CMS access will still use the read-only replica.
+- The DataObject being queried has [`DataObject.must_use_primary_db`](api:SilverStripe\ORM\DataObject->must_use_primary_db) set to `true`, and the query uses an ORM method that later calls `DataQuery::execute()`. This includes most commonly used ORM methods such as [`DataObject::get()`](api:SilverStripe\ORM\DataObject::get()), and excludes [`SQLSelect`](api:SilverStripe\ORM\Queries\SQLSelect) methods. By default, all core security-related DataObjects have `must_use_primary_db` set to `true`.
+- Any code wrapped in a call to [`DB::withPrimary()`](api:SilverStripe\ORM\DB::withPrimary()).
+- The CLI is being used instead of HTTP.
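To illustrate the naming scheme described above, a `.env` file for a primary database with two replicas might look like the following sketch. The host names, user names, and passwords are hypothetical placeholders. Which variables each replica needs is an assumption based on the rule that replica variables mirror the primary ones. `SS_DATABASE_CLASS` and `SS_DATABASE_NAME` are deliberately not repeated per replica, since replicas cannot override them.

```bash
# Primary database (hypothetical values)
SS_DATABASE_CLASS="MySQLDatabase"
SS_DATABASE_SERVER="primary.db.example.com"
SS_DATABASE_USERNAME="app"
SS_DATABASE_PASSWORD="primary-secret"
SS_DATABASE_NAME="mysite"

# First read-only replica: same variable names with the _REPLICA_01 suffix
SS_DATABASE_SERVER_REPLICA_01="replica1.db.example.com"
SS_DATABASE_USERNAME_REPLICA_01="app_readonly"
SS_DATABASE_PASSWORD_REPLICA_01="replica1-secret"

# Second read-only replica: numbering continues sequentially with 02
SS_DATABASE_SERVER_REPLICA_02="replica2.db.example.com"
SS_DATABASE_USERNAME_REPLICA_02="app_readonly"
SS_DATABASE_PASSWORD_REPLICA_02="replica2-secret"
```

Skipping a number (for example defining `_REPLICA_01` and `_REPLICA_03` only) would break the sequential numbering requirement stated above.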
diff --git a/en/08_Changelogs/5.4.0.md b/en/08_Changelogs/5.4.0.md
index f4a2cd53..09a90bc7 100644
--- a/en/08_Changelogs/5.4.0.md
+++ b/en/08_Changelogs/5.4.0.md
@@ -7,6 +7,7 @@ title: 5.4.0 (unreleased)
 ## Overview
 
 - [Features and enhancements](#features-and-enhancements)
+  - [Read-only replica database support](#db-read-only-replicas)
   - [Option to change `ClassName` column from enum to varchar](#classname-varchar)
   - [Other new features](#other-new-features)
 - [API changes](#api-changes)
@@ -14,6 +15,23 @@ title: 5.4.0 (unreleased)
 
 ## Features and enhancements
 
+### Read-only replica database support {#db-read-only-replicas}
+
+Read-only replicas are additional databases that are used to offload read queries from the primary database, which can improve performance by reducing the load on the primary database.
+
+Read-only replicas are configured by adding environment variables that match the primary environment variables, suffixed with `_REPLICA_` followed by the replica number, padded with a leading zero if it's less than 10. For example, `SS_DATABASE_SERVER` becomes `SS_DATABASE_SERVER_REPLICA_01` for the first replica, or `SS_DATABASE_SERVER_REPLICA_12` for the 12th replica. Replicas must be numbered sequentially starting from `01`.
+
+Replicas cannot define different configuration values for `SS_DATABASE_CLASS`, `SS_DATABASE_NAME`, or `SS_DATABASE_CHOOSE_NAME`. These values are restricted to prevent issues that could arise from inconsistent database configurations across replicas.
+
+If one or more read-only replicas have been configured, then one of the read-only replicas will be selected from the pool of available replicas to handle queries for the rest of the request cycle. However, the primary database will be used instead if any of the following criteria are met:
+
+- The current query includes any mutable SQL such as `INSERT` or `DELETE`.
The primary database will be used for the current query, as well as any subsequent queries, including read queries, for the rest of the current request cycle. Mutable SQL is defined on [`DBConnector::isQueryMutable()`](api:SilverStripe\ORM\Connect\DBConnector::isQueryMutable()).
+- The HTTP request matched a rule defined in [`Director.rule_patterns_must_use_primary_db`](api:SilverStripe\Control\Director->rule_patterns_must_use_primary_db). By default, the URL paths `Security`, `dev`, and `admin` (if `silverstripe/admin` is installed) are covered by this.
+- A user with CMS access is logged in. This is done to ensure that logged-in users will correctly see any CMS updates on the website frontend. Users without CMS access will still use the read-only replica.
+- The DataObject being queried has [`DataObject.must_use_primary_db`](api:SilverStripe\ORM\DataObject->must_use_primary_db) set to `true`, and the query uses an ORM method that later calls `DataQuery::execute()`. This includes most commonly used ORM methods such as [`DataObject::get()`](api:SilverStripe\ORM\DataObject::get()), and excludes [`SQLSelect`](api:SilverStripe\ORM\Queries\SQLSelect) methods. By default, all core security-related DataObjects have `must_use_primary_db` set to `true`.
+- Any code wrapped in a call to [`DB::withPrimary()`](api:SilverStripe\ORM\DB::withPrimary()).
+- The CLI is being used instead of HTTP.
+
 ### Option to change `ClassName` column from enum to varchar {#classname-varchar}
 
 On websites with very large database tables it can take a long time to run `dev/build`, which can be a problem when deploying changes to production. This is because the `ClassName` column is an `enum` type, which requires an `ALTER TABLE` query to be run, affecting every row, whenever there is a new valid value for the column.
For a very rough benchmark, running an `ALTER TABLE` query on a database table of 10 million records took 28.52 seconds on a mid-range 2023 laptop, though this time will vary depending on the database and hardware being used.