Skip to content

Commit

Permalink
Add principles and components sections
Browse files Browse the repository at this point in the history
  • Loading branch information
tillrohrmann committed Aug 4, 2023
1 parent e791914 commit 1822a85
Showing 1 changed file with 18 additions and 14 deletions.
32 changes: 18 additions & 14 deletions docs/restate/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,9 @@ Services run on *Service Endpoints* which can be deployed independently of the r

![High level architecture](/img/restate-architecture.png)

## Scalability
## Principles

### Scalability

Restate is highly scalable because it shards the space of service invocations into partitions which are processed each by a dedicated *Partition processor*.
Each *Worker* runs a set of these *Partition processors*.
Expand All @@ -30,7 +32,7 @@ In order to react to changing workloads, the partitions can be merged and split
Dynamic partitioning is still under development.
:::

## Consistency and fault-tolerance
### Consistency and fault-tolerance

Consistency and fault-tolerance is achieved by replicating the *Partition processors* via [Raft](https://raft.github.io/).
All commands for a partition go through the Raft log, which ensures that all partition processor replicas stay consistent and can be recovered in case of failures.
Expand All @@ -45,18 +47,30 @@ Such an architecture is known as [Multi-Raft](https://tikv.org/deep-dive/scalabi
Currently, Restate runs as a single process. The distributed implementation is still under development.
:::

## State storage
## Components

### State storage

The Raft and *Partition processor* state is stored by the *Workers* in [RocksDB](https://github.com/facebook/rocksdb).
Using RocksDB allows for graceful spilling to disk and comparatively fast writes.
Moreover, it supports asynchronous checkpointing which is required to truncate the Raft log.

## State queries
### State queries

Restate makes its internal state accessible via a SQL interface.
Any client that supports the PostgreSQL wire protocol can be used to issue queries.
Internally, the SQL queries are executed using [DataFusion](https://github.com/apache/arrow-datafusion).

### Service registry

All servie meta information is maintained by the *Metas* via the service registry.
The service registry contains information about the registered services which includes the address of the service endpoints, the exposed service methods, their signatures and type definitions.
The service contracts and their type definitions are also exposed via [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) by the *Workers*.

Services are added to the registry by discovering service endpoints.
Upon discovery request, the *Metas* reach out to the specified service endpoint and retrieve all registered services, their methods and type definitions.
After discovery, the new service meta information is propagated to the *Workers* which enables invoking these services through Restate.

## Durable execution via journaling

Things always eventually go wrong, in particular at scale when running a distributed application.
Expand All @@ -71,16 +85,6 @@ A useful side-effect of the journal is that Restate can always suspend an invoca

![Durable execution](/img/durable-execution.png)

## Service registry

All servie meta information is maintained by the *Metas* via the service registry.
The service registry contains information about the registered services which includes the address of the service endpoints, the exposed service methods, their signatures and type definitions.
The service contracts and their type definitions are also exposed via [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) by the *Workers*.

Services are added to the registry by discovering service endpoints.
Upon discovery request, the *Metas* reach out to the specified service endpoint and retrieve all registered services, their methods and type definitions.
After discovery, the new service meta information is propagated to the *Workers* which enables invoking these services through Restate.

## Service invocation flow

All service invocations need to go through Restate which acts as a reverse-proxy for the registered services.
Expand Down

0 comments on commit 1822a85

Please sign in to comment.