sync 07

two state workers and controllers

Modified version of 02.

In this case, each state backend runs in its own dedicated worker but is accessed from the state controller sitting in the worker of the matching driver.


      {worker}                    {worker}                            {worker}
+---------+-------+             +----------+                     +-------+----------+
|         |       |             |          |                     |       |          |
|  left   | state |   (drives)  |          |      (drives)       | state |  right   |
|  driver | contr |<------------|  engine  +-------------------->| contr |  driver  |
|         |       |             |          |                     |       |          |
+---------+-------+             +----------+                     +-------+----------+
              | |                                                  | |
              | |                                                  | |
              | |                                                  | |
              | |                                                  | |
       (read) | |             (write)                              | | (read)
              | +--------------------------------------+           | |
              |                                        |           | |
              |               (write)                  |           | |
  {worker}    |      +---------------------------------------------+ |     {worker}
+----------+  |      |                                 |             |   +----------+
|          |  |      |                                 |             |   |          |
|  left    |<-+      |                                 |             +-->|  right   |
|  state   |         |                                 |                 |  state   |
|          |<--------+                                 +---------------->|          |
+----------+                                                             +----------+

Use case

In the engine:

left.search(searchConditions) # Async
right.search(searchConditions) # Async

In the controller (so in the driver worker):

self.state.search(searchConditions) # Async
messages = self.driver.search_sync(searchConditions) # Sync
stateMessages = self.state.getSearchResult_sync() # Sync

# 2-way merge algo between stateMessages and messages.

# Returns only the messages that changed since last sync.
return mergedMessages # To the engine.
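
The 2-way merge in the controller could be sketched as follows. This is a minimal sketch, not the actual implementation: the `Message` record, its `uid` and `flags` fields, and the `two_way_merge` function are all hypothetical names, and deletions are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message record: a UID plus a set of flags."""
    uid: int
    flags: frozenset = frozenset()


def two_way_merge(state_messages, driver_messages):
    """Return the driver messages that differ from the recorded state.

    A message is reported when it is new on the driver or when its flags
    changed since the last sync. Deletions are not handled in this sketch.
    """
    known = {m.uid: m for m in state_messages}
    changed = []
    for message in driver_messages:
        previous = known.get(message.uid)
        if previous is None or previous.flags != message.flags:
            changed.append(message)
    return changed


# State as recorded at the last sync versus what the driver sees now.
state = [Message(1, frozenset({"R"})), Message(2)]
driver = [Message(1, frozenset({"R"})), Message(2, frozenset({"F"})), Message(3)]
merged = two_way_merge(state, driver)
```

Here message 1 is unchanged and is dropped, while message 2 (new flag) and message 3 (new message) are returned to the engine.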

In the engine:

leftMessages = left.getSearchResult_sync() # Sync
rightMessages = right.getSearchResult_sync() # Sync

# 2-way merge between both sides. Only useful to discard identical
# changes made on both sides.

# Sample: apply the changes on the left side.
left.update(rightMessages) # Async

In the controller (so in the driver worker):

for message in mergedMessages:
  success = self.driver.updateMessage(message)
  if success is True:
    stateSuccess = self.state.updateMessage(message)
    if stateSuccess is not True:
      # Bad. We might take action here, such as tagging the email on disk,
      # to avoid further misaligned state at a low performance cost.
      pass # Placeholder: no recovery action is defined yet.

Concerns

Misalignments

Misalignments might arise when:

left driver and left state match

and

right driver and right state match

and

left and right data differ.

Possible fixes

1. At discover time, each state controller compares the current data (in its driver) with the other state.
   - Pros:
     - Design is very resilient to update errors.
     - Channel to our state might be removed.
   - Cons:
     - ?
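
To illustrate why comparing against the other state catches the misalignment described above, here is a minimal sketch. It assumes each side can be reduced to a mapping from message UID to a set of flags; the `discover` function and the sample data are hypothetical.

```python
def discover(driver_flags, state_flags):
    """Return the UIDs whose flags differ between a driver and a state.

    Both arguments map a message UID to a frozenset of flags. A UID that
    exists on one side only is also reported.
    """
    uids = set(driver_flags) | set(state_flags)
    return sorted(
        uid for uid in uids
        if driver_flags.get(uid) != state_flags.get(uid)
    )


# Each side agrees with its own state, but the two sides differ.
left_driver = {1: frozenset({"R"})}
left_state = {1: frozenset({"R"})}
right_state = {1: frozenset({"F"})}

against_own_state = discover(left_driver, left_state)
against_other_state = discover(left_driver, right_state)
```

Comparing the left driver with the left state finds nothing to do, whereas comparing it with the right state reports UID 1.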

~~The fix (1) for misalignments enables good resilience to update errors.~~ This won't work on write failures.

left driver --- left SC --- engine --- right SC --- right driver
left state  --------------------------------------- right state

t1: discover

M1 +R --- M1L +R --- engine --- M1R -F --- M1 -F
M1 ------------------------------------------ M1

t2: merge & pass

M1 +R --- M1R +R-F --- merge --- M1L +R-F --- M1 -F
M1 --------------------------------------------- M1

t3: updates (left fail)

M1 +R --- M1R +R-F --- merge --- M1L +R-F --- M1 +R-F
M1 +R-F ------------------------------------------ M1

Next sync

t1: discover

M1 +R --- M1L +R --- merge --- (none) --- M1 +R-F
M1 +R-F -------------------------------------- M1

t2: merge & pass

M1 +R --- (none) --- merge --- M1 +R --- M1 +R-F
M1 +R-F -------------------------------------- M1

Quite wrong...

The 2-way merge in the engine

The 2-way merge in the engine aims at filtering out unnecessary changes when both sides changed the same way.

However, if the engine filters updates, the respective state is never updated with the change. The state controllers would then keep reporting the same unnecessary messages, which get filtered again and again on every later sync.
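
The filtering and its side effect can be sketched as follows. This is only an illustration under assumptions: changes are modelled as a mapping from UID to the new set of flags, `engine_merge` is a hypothetical name, and no conflict resolution is attempted when both sides changed differently.

```python
def engine_merge(left_changes, right_changes):
    """Drop changes that are identical on both sides.

    Each argument maps a UID to the new frozenset of flags reported by a
    state controller. Returns (for_left, for_right): the changes each side
    still has to apply. Conflicting changes are passed through untouched.
    """
    for_left = {
        uid: flags for uid, flags in right_changes.items()
        if left_changes.get(uid) != flags
    }
    for_right = {
        uid: flags for uid, flags in left_changes.items()
        if right_changes.get(uid) != flags
    }
    return for_left, for_right


# Message 1 was marked as read on both sides since the last sync.
left_changes = {1: frozenset({"R"})}
right_changes = {1: frozenset({"R"})}

for_left, for_right = engine_merge(left_changes, right_changes)
```

Both results are empty, so no update reaches either controller. Neither state records the new flag, and both controllers would report the very same change at the next discover.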

Possible fixes

1. Remove the 2-way merge in the engine.
   - Pros:
     - Less code.
   - Cons:
     - Drivers might get unnecessary changes.
2. Enable the engine to update the states.
   - Pros:
     - Engine still filters unnecessary changes.
   - Cons:
     - Two more communication channels.
3. The engine marks the unnecessary changes in the message objects as unnecessary for the drivers. So the other state controller propagates those changes to the other state only (not to the drivers) at update time.
   - Pros:
     - Message object metadata is fully described. This should make it easy both to support fine-grained tuning and to expose conflicts/changes to the rascal.
   - Cons:
     - ?
4. At update time, mark the unnecessary changes in the other state controller.
   - Pros:
     - ?
   - Cons:
     - We must retain the collection of changes gathered at discover time (RAM overhead) in each state controller or read our state again (might reduce performance).
     - The same filtering results are computed twice (once on each side).
5. The engine filters and stores the filtered changes in its own database on disk.
   - Pros:
     - ?
   - Cons:
     - One more channel and one more backend to make this async.
     - Yet another database on disk.
     - The database has to be merged at some point to update both states.

We need a way to keep drivers from trying to apply unnecessary changes. Not filtering them out would hurt (performance, at least). There is no obvious solution to this concern. However, fix (3) appears to offer very good flexibility at low cost. It is compatible with fix (1) for misalignments.
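
A minimal sketch of what fix (3) could look like, assuming a hypothetical `Change` record with a `state_only` marker and plain dicts standing in for the driver and the state backends:

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical change record passed from the engine to a controller."""
    uid: int
    flags: frozenset
    state_only: bool = False  # True: the driver already has this change.


def mark_unnecessary(changes_for_side, changes_from_side):
    """Engine side: mark changes that the target side already made itself.

    changes_from_side maps a UID to the flags the target side reported.
    """
    for change in changes_for_side:
        if changes_from_side.get(change.uid) == change.flags:
            change.state_only = True
    return changes_for_side


def apply_changes(changes, driver, state):
    """Controller side: skip the driver for marked changes, update the state."""
    for change in changes:
        if not change.state_only:
            driver[change.uid] = change.flags
        state[change.uid] = change.flags


read = frozenset({"R"})
left_driver, left_state = {1: read}, {1: frozenset()}
incoming = [Change(1, read), Change(2, frozenset({"F"}))]

marked = mark_unnecessary(incoming, {1: read})
apply_changes(marked, left_driver, left_state)
```

The change on message 1 is written to the left state only, so it is not reported again at the next sync, while the change on message 2 goes to both the driver and the state. Error handling on failed writes is deliberately left out.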

Pros

The sync engine and controllers hold simple 2-way merge algos.
Fits well in a fully async environment.
Recovers from errors at a low performance cost.
Globally performance friendly.

Cons

The full merging algorithm is spread across several components. Is that really a con?
The engine needs a way to update the chain of controllers.

Compared to 02

Pros

Better performance.

Cons

Requires more workers.
Requires more channels.
