sync 04

local state on each driver

In this setting, we attach an independent local state to every driver via a controller. The local state remembers the "last known state for the data on that driver", ~~and is not linked in any way to the other end of the sync~~.

Each controller acts regardless of the other side. Since this would not make sense in the real world, the state data is stored in a dedicated namespace for each account. This avoids possible conflicts.


      {worker}                    {worker}                            {worker}
+---------+-------+             +----------+                     +-------+----------+
|         |       |   (drives)  |          |      (drives)       |       |          |
|  driver | state |<------------|  engine  +-------------------->| state |  driver  |
|         |       |             |          |                     |       |          |
+---------+-------+             +----------+                     +-------+----------+

Use case

In the engine:

left.search(searchConditions) # Async
right.search(searchConditions) # Async

In either state controller:


stateMessages = self.state.search(searchConditions) # Sync: might be worth putting
                                                    # the backend out in a worker.
messages = self.driver.search(searchConditions) # Sync

# 2-way merge.

return mergedMessages # To the engine.

I'm stuck. Useful state data for the controller is what was successfully written on the other side. This implies forwarding successful writes to the other side.

merge engine - variant 1

The state-controller has an API that returns a list of changes since last known state. For example: add new message with uid xxx, or add read flag to message with uid xxx.
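A minimal sketch of what such a "changes since last known state" call could compute. Everything here is an assumption made for illustration: the `Change` record, the action names, and the representation of both the cached state and the driver content as a mapping from uid to a set of flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical change record; the action names are illustrative."""
    action: str                 # "add_message", "delete_message", "add_flag", "remove_flag"
    uid: int
    flag: Optional[str] = None


def changes_since_last_state(cached, current):
    """Diff the cached state against the driver content.

    Both arguments are assumed to map uid -> set of flags.
    """
    changes = []
    # Messages the driver has but the cached state does not know about.
    for uid in sorted(current.keys() - cached.keys()):
        changes.append(Change("add_message", uid))
        for flag in sorted(current[uid]):
            changes.append(Change("add_flag", uid, flag))
    # Messages known to the cached state that vanished from the driver.
    for uid in sorted(cached.keys() - current.keys()):
        changes.append(Change("delete_message", uid))
    # Messages present on both sides: compare their flags.
    for uid in sorted(cached.keys() & current.keys()):
        for flag in sorted(current[uid] - cached[uid]):
            changes.append(Change("add_flag", uid, flag))
        for flag in sorted(cached[uid] - current[uid]):
            changes.append(Change("remove_flag", uid, flag))
    return changes
```

The function only reads both states and never updates the cache, which matches the idea that reads should leave the cached state untouched.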

The merge engine will collect the changes on one end and execute them on the other. It needs to be aware of conflicting actions and somehow resolve them: automatically (one end has priority), by failing, or by requesting user action.
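The conflict handling could look roughly like the sketch below. It is only an illustration: changes are assumed to be `(action, uid, flag)` tuples, the rules in `conflicts` are one possible definition of "conflicting actions", and the `policy` values `"left"`, `"right"` and `"fail"` are hypothetical names. Requesting user action is left out.

```python
class ConflictError(Exception):
    """Raised when conflicting changes are found and the policy is "fail"."""


OPPOSITE = {"add_flag": "remove_flag", "remove_flag": "add_flag"}


def conflicts(a, b):
    """True when two (action, uid, flag) changes cannot both be applied."""
    a_action, a_uid, a_flag = a
    b_action, b_uid, b_flag = b
    if a_uid != b_uid:
        return False
    # One end deletes the message while the other end changes it.
    if (a_action == "delete_message") != (b_action == "delete_message"):
        return True
    # Both ends touch the same flag in opposite directions.
    return a_flag == b_flag and OPPOSITE.get(a_action) == b_action


def resolve(left_changes, right_changes, policy="fail"):
    """Return (changes to run on the right, changes to run on the left)."""
    if policy not in ("left", "right", "fail"):
        raise ValueError("unknown policy: %r" % (policy,))
    left_bad = {l for l in left_changes for r in right_changes if conflicts(l, r)}
    right_bad = {r for r in right_changes for l in left_changes if conflicts(l, r)}
    if policy == "fail" and (left_bad or right_bad):
        raise ConflictError(sorted(c[1] for c in left_bad | right_bad))
    # The end with priority keeps its conflicting changes; the other drops them.
    for_right = [c for c in left_changes if policy == "left" or c not in left_bad]
    for_left = [c for c in right_changes if policy == "right" or c not in right_bad]
    return for_right, for_left
```

Identical changes made on both ends are not treated as conflicts here; they are simply forwarded, on the assumption that applying them twice is harmless.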

When the engine is satisfied that a uid is in sync, it should notify both ends to update their cached state. The controller should not update the cached state on reads, but could probably update it on writes.

This would play nicely with features like offlineimap's maxage. On the other hand, both replicas may drift away from each other, so we may need to list all the messages on both ends and make sure that the data coincides after a sync.

merge engine - variant 2

We do a usual 3-way merge, preceded by a first step that compares the last known states on both ends. If they differ, we fail and ask the user to take action. If they coincide, we proceed with the 3-way merge as in offlineimap.
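The two steps could be sketched as follows. This is a generic set-based 3-way merge written for illustration, not offlineimap's actual code. The names `StateMismatch`, `merge_sets` and `sync_variant_2` are hypothetical, and every state is assumed to map uid to a set of flags.

```python
class StateMismatch(Exception):
    """Raised when the last known states on both ends differ."""


def merge_sets(base, left, right):
    """3-way merge of sets: keep what both ends have, plus what one end added."""
    return (left & right) | ((left ^ right) - base)


def sync_variant_2(left_base, right_base, left, right):
    """Compare the last known states, then 3-way merge uids and flags."""
    # Step 1: both ends must agree on the last known state.
    if left_base != right_base:
        raise StateMismatch("last known states differ, user action needed")
    base = left_base
    # Step 2: merge the set of uids, then the flags of each surviving uid.
    uids = merge_sets(set(base), set(left), set(right))
    empty = set()
    return {
        uid: merge_sets(base.get(uid, empty), left.get(uid, empty), right.get(uid, empty))
        for uid in uids
    }
```

In this sketch a deletion on one end wins over a flag change on the other end, which is a design choice rather than a requirement.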

Last known states may diverge in case of failure or interruption, so this is something we should not regard as rare. Maybe this variant is not an option then...

Pros

The state is local to a driver, so there is no need to propagate successful writes from one driver to the other. This probably makes it more robust in case of failure.
The merge engine seems amenable to a reasonably async implementation. It can start working while the "change messages" keep coming. It requires some degree of synchronization between both ends. One possibility: the messages come in uid order, so that the merge engine can be sure that a change is free of conflict once both ends have passed beyond that uid.
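The uid-order idea can be sketched with plain iterators. The assumptions are that each end yields `(uid, change)` pairs sorted by uid and that `merge_in_uid_order` is a hypothetical helper. A real engine would read from async streams, whereas this version simply blocks on whichever end is slower.

```python
import heapq
from itertools import groupby


def merge_in_uid_order(left_stream, right_stream):
    """Yield (uid, left_changes, right_changes) once both ends are past uid.

    Both streams are assumed to yield (uid, change) pairs sorted by uid.
    """
    tagged_left = ((uid, "left", change) for uid, change in left_stream)
    tagged_right = ((uid, "right", change) for uid, change in right_stream)
    # heapq.merge is lazy: it only pulls the next item from a stream when needed.
    merged = heapq.merge(tagged_left, tagged_right, key=lambda item: item[0])
    # A group ends when the merged stream moves on to a larger uid, that is,
    # when both ends have passed beyond the current uid (or are exhausted).
    for uid, group in groupby(merged, key=lambda item: item[0]):
        left_changes, right_changes = [], []
        for _, side, change in group:
            (left_changes if side == "left" else right_changes).append(change)
        yield uid, left_changes, right_changes
```

A uid whose changes come from one end only can be forwarded right away. Only uids with changes on both ends need conflict checks.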

Cons

The engine will be more complex, or at least different from the offlineimap approach.
There may be irreconcilable states in which the last known states on both ends do not match and no reasonable action is available. Those will be rare, but need to be dealt with.
The engine needs a way to update the chain of controllers.
