Update documentation about how to cancel/kill invocations
This commit updates the documentation about how to cancel/kill invocations
to reflect the latest changes. The main change is that invocations can now
be canceled via DELETE <RESTATE_META_ENDPOINT>/invocations/<INVOCATION_IDENTIFIER>.
Invocations can now be killed by adding the query parameter mode=kill:
DELETE <RESTATE_META_ENDPOINT>/invocations/<INVOCATION_IDENTIFIER>?mode=kill

This fixes #233.
tillrohrmann committed Jan 3, 2024
1 parent 297dc02 commit 53e64d4
Showing 1 changed file with 38 additions and 16 deletions.
54 changes: 38 additions & 16 deletions docs/services/invocation.md
@@ -16,7 +16,7 @@ There are different ways to invoke a Restate service:

### Over HTTP

You can invoke services over HTTP 1.1 or higher.
Request/response bodies should be encoded as either JSON or Protobuf.
You can send requests directly from the browser or via `curl` without generating a client.

@@ -115,7 +115,7 @@ $ curl -X PATCH localhost:9070/services/org.example.ExampleService -H 'content-t
```
You can revert it back to public with `{"public": true}`. Private services can still be reached by other Restate services.
For more details on the API, refer to the [admin API docs](/references/admin-api#tag/service/operation/modify_service).

## Invocation identifier
@@ -141,38 +141,60 @@ The Invocation identifier is opaque and its current format should not be relied
## Cancel an invocation
:::caution
If an invocation takes too long to complete and, therefore, is no longer of interest, you can cancel it.

Canceling an invocation allows it to free any resources it is holding and roll back any changes it has made so far.

In order to roll back changes correctly, the service handlers need to contain the necessary compensation logic.

If the required compensation logic is implemented, then the service state stays consistent even in the presence of cancellations.

At the moment, gracefully cancelling an invocation is not supported. It will be supported in future Restate releases.
The cancellation process works recursively in the following way:
First, Restate tries to cancel the leaves of the current invocation, i.e. complete pending sleeps and awakeables or try to cancel calls to other services.

Once the leaves are canceled, the service handler will be notified about the cancellation via a terminal error being thrown at the call site.
This allows the handler to run its specific compensation logic.
The response of the handler will then be propagated back to its caller where it will continue with the cancellation process.
:::note
Canceling an invocation is a non-blocking operation which in some rare cases needs to be retried by the user.
:::
```shell
$ curl -X DELETE <RESTATE_META_ENDPOINT>/invocations/<INVOCATION_IDENTIFIER>
```
For example:
```shell
$ curl -X DELETE http://localhost:9070/invocations/T4pIkIJIGAsBiiGDV2dxK7PkkKnWyWHE
```
For more details on the API, refer to the [admin API docs](/references/admin-api).
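The cancel request above can be wrapped in a small shell helper. This is a minimal sketch, not part of Restate: the function name `cancel_invocation`, the use of a `RESTATE_META_ENDPOINT` environment variable, and its `http://localhost:9070` fallback (borrowed from the example above) are assumptions for illustration. It passes `--fail` so that `curl` exits with a non-zero status on an HTTP error response, which lets a calling script decide whether to send the request again.

```shell
# Hypothetical helper, not part of Restate.
# Assumes the endpoint is stored in RESTATE_META_ENDPOINT; falls back to the
# address used in the example above.
cancel_invocation() {
  endpoint="${RESTATE_META_ENDPOINT:-http://localhost:9070}"
  # Strip one trailing slash so the path does not contain "//".
  curl --fail --silent --show-error -X DELETE "${endpoint%/}/invocations/${1}"
}
```

A call would then look like `cancel_invocation T4pIkIJIGAsBiiGDV2dxK7PkkKnWyWHE`.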
## Kill an invocation
When an invocation fails, Restate retries by default until it can make progress.
For example, if there's a network partitioning, Restate keeps retrying until it can reach the deployment and make progress.
In a few cases, it is not possible for Restate to cancel an invocation.
For example, if the service deployment is permanently unavailable, Restate cannot invoke the service handler to run its compensation logic which is needed to complete the cancellation.
For these cases, Restate provides the ability to kill an invocation.
There are some cases where it is impossible for an invocation to make progress.
A good example is when your code runs a non-deterministic action: If the invocation is suspended and re-scheduled afterwards, the replay of the invocation might lead to a different code path, generating an invalid journal and failing the invocation indefinitely.
In such cases, you can request Restate to kill the invocation, thereby aborting its execution as soon as possible.
If the invocation is ongoing, killing the invocation **will not** roll back its progress.
Killing an invocation means that every call in the call tree of the invocation will be stopped immediately without giving the service handler a chance to react.
This entails that killing the invocation **will not** roll back its progress.
:::danger
:::note
Background calls and delayed calls will not be killed because they are considered detached from the originating call tree.
:::
:::danger
Killing an invocation might leave the service instance in an inconsistent state, just like how killing a process in your operating system may cause the open files to become corrupted. Use it with caution and try to fix the invocation in other ways before resorting to killing it.

:::
To kill an invocation, send the following request to the Restate admin API:
```shell
$ curl -X DELETE <RESTATE_META_ENDPOINT>/invocations/<INVOCATION_IDENTIFIER>
$ curl -X DELETE <RESTATE_META_ENDPOINT>/invocations/<INVOCATION_IDENTIFIER>?mode=kill
```
For example:
```shell
$ curl -X DELETE http://localhost:9070/invocations/T4pIkIJIGAsBiiGDV2dxK7PkkKnWyWHE
$ curl -X DELETE http://localhost:9070/invocations/T4pIkIJIGAsBiiGDV2dxK7PkkKnWyWHE?mode=kill
```
For more details on the API, refer to the [admin API docs](/references/admin-api).
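When scripting the kill request, it helps to quote the URL, because `?` is a glob character in most shells; zsh, for instance, aborts an unquoted command by default when the pattern matches no file. The following is a minimal sketch under assumptions: the function name `kill_invocation`, the `RESTATE_META_ENDPOINT` environment variable, and its `http://localhost:9070` fallback are illustrative and not part of Restate.

```shell
# Hypothetical helper, not part of Restate.
# Assumes the endpoint is stored in RESTATE_META_ENDPOINT; falls back to the
# address used in the example above.
kill_invocation() {
  endpoint="${RESTATE_META_ENDPOINT:-http://localhost:9070}"
  # The URL is double-quoted so the shell never treats "?" as a glob pattern.
  curl --fail --silent --show-error -X DELETE "${endpoint%/}/invocations/${1}?mode=kill"
}
```

A call would then look like `kill_invocation T4pIkIJIGAsBiiGDV2dxK7PkkKnWyWHE`. Keeping this separate from any cancel helper makes the destructive operation explicit at the call site.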
@@ -181,10 +203,10 @@ For more details on the API, refer to the [admin API docs](/references/admin-api
For each retry attempt, Restate internally holds an inactivity timer to track whether the service is active and generating some work, such as setting state, invoking other services, etc. This timer can be configured with the option [`worker.invoker.inactivity_timeout`](https://docs.restate.dev/restate/configuration).
Once the `inactivity_timeout` is fired, Restate tries to gracefully suspend the invocation while waiting for an event that triggers the resumption of the invocation.
When suspending, the Restate SDK will continue executing the service code until it reaches a _suspension point_, that is a point in your service code where it's safe to interrupt the execution, for example when `await`ing on a response from another service.

When suspending, Restate internally starts another timer to protect Restate from connection issues and/or misbehaving code/SDKs that prevent the tear down of the connection. This timer can be configured with the option [`worker.invoker.abort_timeout`](https://docs.restate.dev/restate/configuration).
Once the `abort_timeout` is fired, the connection to the deployment endpoint is closed, and all in-flight progress is discarded.

If you have [side effects](sdk/side-effects) that take more than `inactivity_timeout + abort_timeout` to execute, you might need to tune these timeouts accordingly, for example by increasing the `inactivity_timeout` to a value larger than the expected side effect duration.
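The rule of thumb above is a simple comparison, sketched here in shell. The function name and every duration are hypothetical values in seconds chosen for illustration; they are not Restate defaults, and the sketch does not read any Restate configuration.

```shell
# Hypothetical check; all durations are illustrative values in seconds.
# Succeeds (exit status 0) when the side effect takes longer than
# inactivity_timeout + abort_timeout.
exceeds_timeout_budget() {
  side_effect_duration="$1"
  inactivity_timeout="$2"
  abort_timeout="$3"
  [ "$side_effect_duration" -gt $((inactivity_timeout + abort_timeout)) ]
}

# A 150 s side effect against 60 s + 60 s exceeds the budget.
if exceeds_timeout_budget 150 60 60; then
  echo "side effect exceeds the timeout budget; consider raising inactivity_timeout"
fi
```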
