diff --git a/docs/restate/architecture.md b/docs/restate/architecture.md index 784dd0e4..2bb080fc 100644 --- a/docs/restate/architecture.md +++ b/docs/restate/architecture.md @@ -11,7 +11,7 @@ The runtime is split into the control plane consisting of *Metas* and the data p The *Metas* are responsible for managing the service meta information and coordinating the *Workers*. -The *Workers* are responsible for invoking services, storing their journal and service state as well as maintaining processing order. +The *Workers* are responsible for invoking services, storing the invocation and service state as well as maintaining processing order. *Workers* also expose a *SQL* interface to query the runtime's internal state. Services run on *Service Endpoints* which can be deployed independently of the runtime. @@ -56,3 +56,48 @@ Moreover, it supports asynchronous checkpointing which is required to truncate t Restate makes its internal state accessible via a SQL interface. Any client that supports the PostgreSQL wire protocol can be used to issue queries. Internally, the SQL queries are executed using [DataFusion](https://github.com/apache/arrow-datafusion). + +## Durable execution via journaling + +Things always eventually go wrong, in particular at scale when running a distributed application. +Therefore, services that Restate invokes will also fail eventually. +In these cases, the services need to be re-invoked so that they can complete. +Ideally, services resume from the point where they failed without re-doing all the previous actions, effectively resulting into durable execution of services. + +The way Restate achieves durable execution is by recording the actions a service takes in a durable journal. +When re-invoking a service, the journal is sent alongside the service state to the service endpoint. +The service endpoint resumes the invocation by replaying the recorded journal entries without re-doing the actual operations (e.g. calling another service, storing state, running a side-effect, etc.). +A useful side-effect of the journal is that Restate can always suspend an invocation and resume it at a later point in time if need should arise. + +![Durable execution](/img/durable-execution.png) + +## Service registry + +All servie meta information is maintained by the *Metas* via the service registry. +The service registry contains information about the registered services which includes the address of the service endpoints, the exposed service methods, their signatures and type definitions. +The service contracts and their type definitions are also exposed via [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) by the *Workers*. + +Services are added to the registry by discovering service endpoints. +Upon discovery request, the *Metas* reach out to the specified service endpoint and retrieve all registered services, their methods and type definitions. +After discovery, the new service meta information is propagated to the *Workers* which enables invoking these services through Restate. + +## Service invocation flow + +All service invocations need to go through Restate which acts as a reverse-proxy for the registered services. +Restate knows where the services are running and invokes them on behalf of the clients. +By putting Restate in charge of the actual invocation, the system can achieve the following properties: + +* Enrich the invocation with extra information such as service and journal state +* Guarantee that keyed invocations are executed in order and isolation +* Retry failed invocations until they complete + +The service invocation flow is as follows: + +1. New invocations are received by the *Ingress service* running on the *Workers*. The ingress extracts the invocation key which determines the partition responsible for processing the invocation. +2. Based on the key, the invocation is forwarded to the right *Worker* running the *Partition processor* of the target partition. +3. The *Partition processor* makes sure that for a given key and service, the invocations are executed sequentially. +4. Once all preceding invocations have completed, the invocation request will be enriched with the service's state and sent to the service endpoint where it executes. +5. While executing, the runtime journals actions the service takes in order to be able to recover from failures without redoing these actions. +6. Once the invocation completes with a response, it will be sent back to the *Ingress service* and then to the invoking client. + +![Service invocation flow](/img/service-invocation-flow.png) diff --git a/static/img/durable-execution.png b/static/img/durable-execution.png new file mode 100644 index 00000000..eae6864e Binary files /dev/null and b/static/img/durable-execution.png differ diff --git a/static/img/service-invocation-flow.png b/static/img/service-invocation-flow.png new file mode 100644 index 00000000..2089c0e0 Binary files /dev/null and b/static/img/service-invocation-flow.png differ