Add docs for agent upgrade restart feature (#568)
* Add docs for agent upgrade restart feature

* Add note in upgrading summary about restarting
kilfoyle authored Oct 10, 2023
1 parent fe3a534 commit add4ac2
Showing 1 changed file with 58 additions and 3 deletions.
61 changes: 58 additions & 3 deletions docs/en/ingest-management/fleet/upgrade-elastic-agent.asciidoc
@@ -16,11 +16,13 @@ date and time.

In most failure cases the {agent} may retry an upgrade after a short wait. The
wait durations between retries are: 1m, 5m, 10m, 15m, 30m, and 1h. During this
-time, the {agent} may show up as "retrying" in the {fleet} UI.
-//Note that you can abort an upgrade that is being retried. See <<abort-agent-upgrade>>.
+time, the {agent} may show up as "retrying" in the {fleet} UI. As well, if agent
+upgrades have been detected to have stalled, you can restart the upgrade process
+for a <<restart-upgrade-single,single agent>> or in bulk for
+<<restart-upgrade-multiple,multiple agents>>.

This approach simplifies the process of keeping your agents up to date. It also
-saves you time because you dont need third-party tools or processes to
+saves you time because you don't need third-party tools or processes to
manage upgrades.

By default, {agent}s require internet access to perform binary upgrades from
@@ -50,6 +52,12 @@ can perform the following upgrade-related actions:
|<<view-upgrade-status>>
|View the status of an upgrade, including upgrade metrics and agent logs.

|<<restart-upgrade-single>>
|Restart an upgrade process that has stalled for a single agent.

|<<restart-upgrade-multiple>>
|Do a bulk restart of the upgrade process for a set of agents.

|===


@@ -144,3 +152,50 @@ don't see the host name, try refreshing the page.
+
[role="screenshot"]
image::images/upgrade-failure.png[Agent logs showing upgrade failure]

[discrete]
[[restart-upgrade-single]]
== Restart an upgrade for a single agent

An {agent} upgrade process may sometimes stall. This can happen for various
reasons, including, for example, network connectivity issues or a delayed shutdown.

When an {agent} upgrade has been detected to be stuck, a warning indicator
appears on the UI. When this occurs, you can restart the upgrade from either the
*Agents* tab on the main {fleet} page or from the details page for any individual
agent.

Restart from main {fleet} page:

. From the **Actions** menu next to an agent that is stuck in an `Updating`
state, choose **Restart upgrade**.
. In the **Restart upgrade** window, select an upgrade version and click
**Upgrade agent**.

Restart from an agent details page:

. In {fleet}, in the **Host** column, click the agent's name. On the
**Agent details** tab, a warning notice appears if the agent is detected to have
stalled during an upgrade.
. Click *Restart upgrade*.
. In the **Restart upgrade** window, select an upgrade version and click
**Upgrade agent**.

[discrete]
[[restart-upgrade-multiple]]
== Restart an upgrade for multiple agents

When the upgrade process for multiple agents has been detected to have stalled,
you can restart the upgrade process in bulk.

. On the **Agents** tab, select any set of the agents that are indicated to be stuck, and click **Actions**.
. From the **Actions** menu, select **Restart upgrade <number> agents**.
. In the **Restart upgrade...** window, select an upgrade version.
. Select the amount of time available for the maintenance window. The upgrades
are spread out uniformly across this maintenance window to avoid exhausting
network resources.
+
To force selected agents to upgrade immediately when the upgrade is
triggered, select **Immediately**. Avoid using this setting for batches of more
than 10 agents.
. Restart the upgrades.
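
To illustrate the "spread out uniformly across this maintenance window" behavior described in the bulk restart steps, here is a minimal Python sketch. It is not {fleet}'s actual scheduler: the function name `spread_upgrades`, its parameters, and the one-slot-per-agent rule are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def spread_upgrades(agent_ids, window_start, window):
    """Assign each agent an evenly spaced start time inside the window.

    Hypothetical helper: illustrates uniform spreading only, not Fleet's
    real scheduling logic.
    """
    if not agent_ids:
        return {}
    # Divide the window into one equal slot per agent. Each agent starts at
    # the beginning of its slot, so every start falls before the window ends.
    step = window / len(agent_ids)
    return {
        agent_id: window_start + i * step
        for i, agent_id in enumerate(agent_ids)
    }


# Example: four agents over a one-hour maintenance window.
schedule = spread_upgrades(
    ["agent-a", "agent-b", "agent-c", "agent-d"],
    datetime(2023, 10, 10, 12, 0),
    timedelta(hours=1),
)
for agent_id, start in schedule.items():
    print(agent_id, start.isoformat())
```

With these inputs the starts land 15 minutes apart. Under this sketch, choosing **Immediately** would correspond to a zero-length window, so every agent starts at once, which is consistent with the advice to avoid that setting for larger batches.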
