Increase Timeout (#377)
- During a partner cloud upgrade, COU was found to fail when checking
whether the whole model is active/idle, because the default timeout is
not long enough.

- Keystone also needs some extra time, so this change raises the long
idle timeout from 30 to 40 minutes.
gabrielcocenza authored Apr 18, 2024
1 parent fe2d671 commit 8f1300c
Showing 14 changed files with 50 additions and 50 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -44,7 +44,7 @@ Commands:
  - `COU_MODEL_RETRIES` - define how many times to retry the connection to Juju model before giving up. Default value is 5 times.
  - `COU_MODEL_RETRY_BACKOFF` - define number of seconds to increase the wait between connection to the Juju model retry attempts. Default value is 2 seconds.
  - `COU_STANDARD_IDLE_TIMEOUT` - how long COU will wait for an application to settle to active/idle and declare the upgrade complete. The default value is 300 seconds.
- - `COU_LONG_IDLE_TIMEOUT` - a longer version of COU_STANDARD_IDLE_TIMEOUT for applications that are known to need more time than usual to upgrade like such as Keystone and Octavia. The default value is 1800 seconds.
+ - `COU_LONG_IDLE_TIMEOUT` - a longer version of COU_STANDARD_IDLE_TIMEOUT for applications that are known to need more time than usual to upgrade like such as Keystone and Octavia. The default value is 2400 seconds.

## Supported Upgrade Paths

2 changes: 1 addition & 1 deletion cou/apps/base.py
@@ -48,7 +48,7 @@
STANDARD_IDLE_TIMEOUT: int = int(
os.environ.get("COU_STANDARD_IDLE_TIMEOUT", 5 * 60)
) # default of 5 min
- LONG_IDLE_TIMEOUT: int = int(os.environ.get("COU_LONG_IDLE_TIMEOUT", 30 * 60)) # default of 30 min
+ LONG_IDLE_TIMEOUT: int = int(os.environ.get("COU_LONG_IDLE_TIMEOUT", 40 * 60)) # default of 40 min
ORIGIN_SETTINGS = ("openstack-origin", "source")
REQUIRED_SETTINGS = ("enable-auto-restarts", "action-managed-upgrade", *ORIGIN_SETTINGS)
LATEST_STABLE = "latest/stable"
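
A minimal sketch of how an environment override of this kind behaves, assuming the same `int(os.environ.get(...))` pattern as the hunk above. The helper `read_timeout` and its `environ` parameter are hypothetical names introduced here for illustration; they are not part of COU.

```python
import os
from typing import Mapping


def read_timeout(
    name: str, default_seconds: int, environ: Mapping[str, str] = os.environ
) -> int:
    """Return the timeout in seconds from `environ`, falling back to the default."""
    # Environment values are strings, so they are converted with int();
    # a non-numeric value would raise ValueError here.
    return int(environ.get(name, default_seconds))


# With the variable unset, the new default of 40 minutes applies.
unset = read_timeout("COU_LONG_IDLE_TIMEOUT", 40 * 60, environ={})

# An operator can still pick a different value, for example one hour.
overridden = read_timeout(
    "COU_LONG_IDLE_TIMEOUT", 40 * 60, environ={"COU_LONG_IDLE_TIMEOUT": "3600"}
)

print(unset, overridden)
```

Passing the mapping explicitly keeps the sketch independent of the real process environment.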
4 changes: 2 additions & 2 deletions cou/steps/plan.py
@@ -369,9 +369,9 @@ def _get_pre_upgrade_steps(analysis_result: Analysis, args: CLIargs) -> list[Pre
coro=analysis_result.model.wait_for_active_idle(
# NOTE (rgildein): We need to DEFAULT_TIMEOUT so it's possible to change if
# a network is too slow, this could cause an issue.
- # We are using max function to ensure timeout is always at least 11 (1 second
+ # We are using max function to ensure timeout is always at least 120 (110 seconds
# higher than the idle_period to prevent false negative).
- timeout=max(DEFAULT_TIMEOUT + 1, 11),
+ timeout=max(DEFAULT_TIMEOUT, 120),
idle_period=10,
raise_on_blocked=True,
),
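
The clamping in this hunk can be looked at in isolation. The sketch below is an illustration only, not COU code: `effective_timeout_old` and `effective_timeout_new` are hypothetical helpers that reproduce the two `max(...)` expressions, and the sample inputs are made up, since the value of `DEFAULT_TIMEOUT` is not shown in this diff.

```python
IDLE_PERIOD = 10  # seconds, as passed to wait_for_active_idle in the hunk


def effective_timeout_old(default_timeout: int) -> int:
    """Previous expression: at least 11 s, i.e. 1 s above the idle period."""
    return max(default_timeout + 1, 11)


def effective_timeout_new(default_timeout: int) -> int:
    """New expression: at least 120 s, i.e. 110 s above the idle period."""
    return max(default_timeout, 120)


for sample in (5, 60, 600):  # hypothetical DEFAULT_TIMEOUT values
    old = effective_timeout_old(sample)
    new = effective_timeout_new(sample)
    # Both results must exceed the idle period; otherwise the wait could time
    # out before the model has been idle for the required duration.
    assert old > IDLE_PERIOD and new > IDLE_PERIOD
    print(f"DEFAULT_TIMEOUT={sample}: old={old}s new={new}s")
```

Small values are raised to the 120-second floor, while values above 120 are passed through unchanged by the new expression.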
2 changes: 1 addition & 1 deletion docs/how-to/interruption.rst
@@ -33,7 +33,7 @@ Usage example:
Upgrade software packages on unit 'keystone/2'
Upgrade 'keystone' to the new channel: 'victoria/stable'
Change charm config of 'keystone' 'openstack-origin' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test-model' to reach the idle state
+ Wait for up to 2400s for model 'test-model' to reach the idle state
Verify that the workload of 'keystone' has been upgraded

Would you like to start the upgrade? Continue (y/N): n
2 changes: 1 addition & 1 deletion docs/how-to/no-backup.rst
@@ -49,7 +49,7 @@ Upgrade:
Upgrade software packages on unit 'rabbitmq-server/2'
Upgrade 'rabbitmq-server' to the new channel: '3.9/stable'
Change charm config of 'rabbitmq-server' 'source' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test-model' to reach the idle state
+ Wait for up to 2400s for model 'test-model' to reach the idle state
Verify that the workload of 'rabbitmq-server' has been upgraded

Continue (y/n): y
4 changes: 2 additions & 2 deletions docs/how-to/plan-upgrade.rst
@@ -37,7 +37,7 @@ Output example
Change charm config of 'keystone' 'action-managed-upgrade' to 'False'
Upgrade 'keystone' to the new channel: 'victoria/stable'
Change charm config of 'keystone' 'openstack-origin' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'keystone' has been upgraded on units: keystone/0
Control Plane subordinate(s) upgrade plan
Upgrade plan for 'keystone-ldap' to 'victoria'
@@ -59,7 +59,7 @@ Output example
├── Upgrade the unit: 'nova-compute/0'
├── Resume the unit: 'nova-compute/0'
Enable nova-compute scheduler from unit: 'nova-compute/0'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0
Remaining Data Plane principal(s) upgrade plan
Upgrade plan for 'ceph-osd' to 'victoria'
6 changes: 3 additions & 3 deletions docs/how-to/upgrade-cloud.rst
@@ -130,7 +130,7 @@ Usage example
Upgrade software packages on unit 'rabbitmq-server/2'
Upgrade 'rabbitmq-server' to the new channel: '3.9/stable'
Change charm config of 'rabbitmq-server' 'source' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test-model' to reach the idle state
+ Wait for up to 2400s for model 'test-model' to reach the idle state
Verify that the workload of 'rabbitmq-server' has been upgraded
...
Would you like to start the upgrade? Continue (y/N): y
@@ -145,7 +145,7 @@ Usage example
Upgrade software packages on unit 'rabbitmq-server/2'
Upgrade 'rabbitmq-server' to the new channel: '3.9/stable'
Change charm config of 'rabbitmq-server' 'source' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test-model' to reach the idle state
+ Wait for up to 2400s for model 'test-model' to reach the idle state
Verify that the workload of 'rabbitmq-server' has been upgraded

Continue (y/n): y
@@ -158,7 +158,7 @@ Usage example
Upgrade software packages on unit 'keystone/2'
Upgrade 'keystone' to the new channel: 'victoria/stable'
Change charm config of 'keystone' 'openstack-origin' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'test-model' to reach the idle state
+ Wait for up to 2400s for model 'test-model' to reach the idle state
Verify that the workload of 'keystone' has been upgraded

Continue (y/n): y
2 changes: 1 addition & 1 deletion docs/reference/environment-variables.rst
@@ -14,4 +14,4 @@ Environment Variables
to **active/idle** and declare the upgrade complete. The default value is 300 seconds.
* **COU_LONG_IDLE_TIMEOUT** - a longer version of **COU_STANDARD_IDLE_TIMEOUT** for applications
that are known to need more time than usual to upgrade, such as Keystone and Octavia. The
- default value is 1800 seconds.
+ default value is 2400 seconds.
2 changes: 1 addition & 1 deletion tests/functional/tests/smoke.py
@@ -153,7 +153,7 @@ def generate_expected_plan(self, backup: bool = True) -> str:
"\t\t\t\tUpgrade software packages on unit 'mysql-innodb-cluster/2'\n"
"\t\t\tChange charm config of 'mysql-innodb-cluster' 'source' to "
"'cloud:focal-victoria'\n"
- "\t\t\tWait for up to 1800s for app 'mysql-innodb-cluster' to reach the idle state\n"
+ "\t\t\tWait for up to 2400s for app 'mysql-innodb-cluster' to reach the idle state\n"
"\t\t\tVerify that the workload of 'mysql-innodb-cluster' has been upgraded on units: "
"mysql-innodb-cluster/0, mysql-innodb-cluster/1, mysql-innodb-cluster/2\n"
)
4 changes: 2 additions & 2 deletions tests/mocked_plans/sample_plans/base.yaml
@@ -10,7 +10,7 @@ plan: |
Change charm config of 'keystone' 'action-managed-upgrade' to 'False'
Upgrade 'keystone' to the new channel: 'victoria/stable'
Change charm config of 'keystone' 'openstack-origin' to 'cloud:focal-victoria'
- Wait for up to 1800s for model 'base' to reach the idle state
+ Wait for up to 2400s for model 'base' to reach the idle state
Verify that the workload of 'keystone' has been upgraded on units: keystone/0
Control Plane subordinate(s) upgrade plan
Upgrade plan for 'keystone-ldap' to 'victoria'
@@ -32,7 +32,7 @@ plan: |
├── Upgrade the unit: 'nova-compute/0'
├── Resume the unit: 'nova-compute/0'
Enable nova-compute scheduler from unit: 'nova-compute/0'
- Wait for up to 1800s for model 'base' to reach the idle state
+ Wait for up to 2400s for model 'base' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0
Remaining Data Plane principal(s) upgrade plan
Upgrade plan for 'ceph-osd' to 'victoria'
24 changes: 12 additions & 12 deletions tests/unit/apps/test_auxiliary.py
@@ -168,9 +168,9 @@ def test_auxiliary_upgrade_plan_ussuri_to_victoria_change_channel(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -238,9 +238,9 @@ def test_auxiliary_upgrade_plan_ussuri_to_victoria(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -314,9 +314,9 @@ def test_auxiliary_upgrade_plan_ussuri_to_victoria_ch_migration(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -694,9 +694,9 @@ def test_ceph_mon_upgrade_plan_xena_to_yoga(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -772,9 +772,9 @@ def test_ceph_mon_upgrade_plan_ussuri_to_victoria(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -1119,9 +1119,9 @@ def test_mysql_innodb_cluster_upgrade(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for app '{app.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for app '{app.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=[app.name]),
+ coro=model.wait_for_active_idle(2400, apps=[app.name]),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
24 changes: 12 additions & 12 deletions tests/unit/apps/test_core.py
@@ -230,9 +230,9 @@ def test_upgrade_plan_ussuri_to_victoria(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -314,9 +314,9 @@ def test_upgrade_plan_ussuri_to_victoria_ch_migration(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -392,9 +392,9 @@ def test_upgrade_plan_channel_on_next_os_release(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -471,9 +471,9 @@ def test_upgrade_plan_origin_already_on_next_openstack_release(model):
coro=model.upgrade_charm(app.name, "victoria/stable"),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -588,9 +588,9 @@ def test_upgrade_plan_application_already_disable_action_managed(model):
),
),
PostUpgradeStep(
- description=f"Wait for up to 1800s for model '{model.name}' to reach the idle state",
+ description=f"Wait for up to 2400s for model '{model.name}' to reach the idle state",
parallel=False,
- coro=model.wait_for_active_idle(1800, apps=None),
+ coro=model.wait_for_active_idle(2400, apps=None),
),
PostUpgradeStep(
description=f"Verify that the workload of '{app.name}' has been upgraded on units: "
@@ -809,7 +809,7 @@ def test_nova_compute_upgrade_plan(model):
Enable nova-compute scheduler from unit: 'nova-compute/0'
Enable nova-compute scheduler from unit: 'nova-compute/1'
Enable nova-compute scheduler from unit: 'nova-compute/2'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0, nova-compute/1, nova-compute/2
""" # noqa: E501 line too long
)
@@ -862,7 +862,7 @@ def test_nova_compute_upgrade_plan_single_unit(model):
├── Upgrade the unit: 'nova-compute/0'
├── Resume the unit: 'nova-compute/0'
Enable nova-compute scheduler from unit: 'nova-compute/0'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0
"""
)
10 changes: 5 additions & 5 deletions tests/unit/steps/test_hypervisor.py
@@ -420,7 +420,7 @@ def test_hypervisor_upgrade_plan(model):
Wait for up to 300s for app 'cinder' to reach the idle state
Verify that the workload of 'cinder' has been upgraded on units: cinder/0
Enable nova-compute scheduler from unit: 'nova-compute/0'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0
Upgrade plan for 'az-1' to 'victoria'
Disable nova-compute scheduler from unit: 'nova-compute/1'
@@ -437,7 +437,7 @@ def test_hypervisor_upgrade_plan(model):
├── Upgrade the unit: 'nova-compute/1'
├── Resume the unit: 'nova-compute/1'
Enable nova-compute scheduler from unit: 'nova-compute/1'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/1
Upgrade plan for 'az-2' to 'victoria'
Disable nova-compute scheduler from unit: 'nova-compute/2'
@@ -454,7 +454,7 @@ def test_hypervisor_upgrade_plan(model):
├── Upgrade the unit: 'nova-compute/2'
├── Resume the unit: 'nova-compute/2'
Enable nova-compute scheduler from unit: 'nova-compute/2'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/2
"""
)
@@ -547,7 +547,7 @@ def test_hypervisor_upgrade_plan_single_machine(model):
Wait for up to 300s for app 'cinder' to reach the idle state
Verify that the workload of 'cinder' has been upgraded on units: cinder/0
Enable nova-compute scheduler from unit: 'nova-compute/0'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/0
"""
)
@@ -639,7 +639,7 @@ def test_hypervisor_upgrade_plan_some_units_upgraded(model):
Wait for up to 300s for app 'cinder' to reach the idle state
Verify that the workload of 'cinder' has been upgraded on units: cinder/2
Enable nova-compute scheduler from unit: 'nova-compute/2'
- Wait for up to 1800s for model 'test_model' to reach the idle state
+ Wait for up to 2400s for model 'test_model' to reach the idle state
Verify that the workload of 'nova-compute' has been upgraded on units: nova-compute/2
"""
)