This repository has been archived by the owner on Oct 7, 2022. It is now read-only.

sync 07

Nicolas Sebrecht edited this page Apr 7, 2016 · 5 revisions

two state workers and controllers

Modified version of 02.

In this case, each state backend runs in a dedicated worker but is accessed from a state controller located in the corresponding driver's worker.


      {worker}                    {worker}                            {worker}
+---------+-------+             +----------+                     +-------+----------+
|         |       |             |          |                     |       |          |
|  left   | state |   (drives)  |          |      (drives)       | state |  right   |
|  driver | contr |<------------|  engine  +-------------------->| contr |  driver  |
|         |       |             |          |                     |       |          |
+---------+-------+             +----------+                     +-------+----------+
              | |                                                  | |
              | |                                                  | |
              | |                                                  | |
              | |                                                  | |
       (read) | |             (write)                              | | (read)
              | +--------------------------------------+           | |
              |                                        |           | |
              |               (write)                  |           | |
  {worker}    |      +---------------------------------------------+ |     {worker}
+----------+  |      |                                 |             |   +----------+
|          |  |      |                                 |             |   |          |
|  left    |<-+      |                                 |             +-->|  right   |
|  state   |         |                                 |                 |  state   |
|          |<--------+                                 +---------------->|          |
+----------+                                                             +----------+

Use case

  • In the engine:
left.search(searchConditions) # Async
right.search(searchConditions) # Async
  • In the controller (so in the driver worker):
self.state.search(searchConditions) # Async
messages = self.driver.search_sync(searchConditions) # Sync
stateMessages = self.state.getSearchResult_sync() # Sync

# 2-way merge algo between stateMessages and messages.

# Returns only the messages that changed since last sync.
return mergedMessages # To the engine.
  • In the engine:
leftMessages = left.getSearchResult_sync() # Sync
rightMessages = right.getSearchResult_sync() # Sync

# 2-way merge between both sides. Would only be useful to discard identical
# changes made on both sides.

# Sample: apply the changes on the left side.
left.update(rightMessages) # Async
  • In the controller (so in the driver worker):
for message in mergedMessages:
  success = self.driver.updateMessage(message)
  if success is True:
    stateSuccess = self.state.updateMessage(message)
    if stateSuccess is not True:
      # Bad: driver and state are now misaligned. We might take action here,
      # such as tagging the email on disk, to avoid further unaligned state
      # at a "low performance penalty".
      pass
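
The 2-way merge done by the controller at discover time can be sketched as follows. This is only a minimal illustration, not the project's code: the `Message` record (a UID plus a set of flags) and the `two_way_merge` function are hypothetical names, and deletions are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message record: a UID plus a set of flags."""
    uid: int
    flags: frozenset = frozenset()


def two_way_merge(state_messages, driver_messages):
    """Return the driver messages that are new or differ from the state."""
    known = {m.uid: m for m in state_messages}
    changed = []
    for message in driver_messages:
        # Unknown UID or different flags: the message changed since last sync.
        if known.get(message.uid) != message:
            changed.append(message)
    return changed


state = [Message(1, frozenset({"R"})), Message(2)]
driver = [Message(1, frozenset({"R"})), Message(2, frozenset({"F"})), Message(3)]
merged = two_way_merge(state, driver)
print([m.uid for m in merged])
```

Message 1 is identical on both sides and is dropped; message 2 (flag change) and message 3 (new) are returned to the engine.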

Concerns

Misalignments

Misalignments might arise when:

  • left driver and left state match

and

  • right driver and right state match

and

  • left and right data differ.

Possible fixes

  1. At discover time, each state controller compares the current data (in driver) with the other state.
    • Pros:
      • Design is very resilient on update errors.
      • Channel to our state might be removed.
    • Cons:
      • ?
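
One possible reading of fix (1), as a rough sketch: at discover time the state controller diffs what its driver currently holds against the other side's state. The dict-of-flags representation and the `discover_changes` name are assumptions made for the example only.

```python
def discover_changes(driver_messages, other_state_messages):
    """Compare the driver's current data with the other side's state.

    Both arguments map a message UID to a frozenset of flags (an assumed
    representation). Returns the UIDs that are unknown to the other state
    or whose flags differ, mapped to the driver's current flags.
    """
    changes = {}
    for uid, flags in driver_messages.items():
        if other_state_messages.get(uid) != flags:
            changes[uid] = flags
    return changes


left_driver = {1: frozenset({"R"}), 2: frozenset()}
right_state = {1: frozenset(), 2: frozenset()}
changes = discover_changes(left_driver, right_state)
print(changes)
```

In this sketch the comparison never reads the controller's own state, which is why the channel to it might be removed.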

Fix (1) for misalignments enables good resilience to update errors. It won't work on write failures, though.

left driver --- left SC --- engine --- right SC --- right driver
left state  --------------------------------------- right state
  • t1: discover
M1 +R --- M1L +R --- engine --- M1R -F --- M1 -F
M1 ------------------------------------------ M1
  • t2: merge & pass
M1 +R --- M1R +R-F --- merge --- M1L +R-F --- M1 -F
M1 --------------------------------------------- M1
  • t3: updates (left fail)
M1 +R --- M1R +R-F --- merge --- M1L +R-F --- M1 +R-F
M1 +R-F ------------------------------------------ M1
Next sync
  • t1: discover
M1 +R --- M1L +R --- merge --- (none) --- M1 +R-F
M1 +R-F -------------------------------------- M1
  • t2: merge & pass
M1 +R --- (none) --- merge --- M1 +R --- M1 +R-F
M1 +R-F -------------------------------------- M1

Quite wrong...

The 2-way merge in the engine

The 2-way merge in the engine aims at filtering out unnecessary changes when both sides changed the same way.

However, if the engine filters updates, then the respective state is never updated with the change. The state controllers would then keep providing the same unnecessary messages, which get filtered again and again on subsequent syncs.
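
The feedback loop described above can be reproduced with a toy simulation. Everything here is a simplification for illustration: messages are plain `uid -> flags` dicts, each controller records updates in a single state, and the function names are hypothetical.

```python
def discover(state, driver):
    """Changes a state controller reports: driver entries differing from its state."""
    return {uid: flags for uid, flags in driver.items() if state.get(uid) != flags}


def engine_filter(left_changes, right_changes):
    """Drop the changes that both sides made identically."""
    to_right = {u: f for u, f in left_changes.items() if right_changes.get(u) != f}
    to_left = {u: f for u, f in right_changes.items() if left_changes.get(u) != f}
    return to_left, to_right


def update(driver, state, changes):
    """Apply changes to the driver, then record them in the state."""
    for uid, flags in changes.items():
        driver[uid] = flags
        state[uid] = flags


# Both sides marked message 1 as read since the last sync.
seen = frozenset({"R"})
left_driver, left_state = {1: seen}, {1: frozenset()}
right_driver, right_state = {1: seen}, {1: frozenset()}

reported_per_sync = []
for _ in range(3):
    left_changes = discover(left_state, left_driver)
    right_changes = discover(right_state, right_driver)
    reported_per_sync.append((len(left_changes), len(right_changes)))
    to_left, to_right = engine_filter(left_changes, right_changes)
    # The engine filtered everything out, so no state is ever updated.
    update(left_driver, left_state, to_left)
    update(right_driver, right_state, to_right)

print(reported_per_sync)
```

Each of the three syncs reports the same change on both sides, because the filtered change never reaches either state.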

Possible fixes

  1. Remove the 2-way merge in the engine.
    • Pros:
      • Less code.
    • Cons:
      • Drivers might get unnecessary changes.
  2. Enable the engine to update the states.
    • Pros:
      • Engine still filters unnecessary changes.
    • Cons:
      • Two more communication channels.
  3. The engine marks the unnecessary changes in the message objects as unnecessary for the drivers. The other state controller then propagates those changes to the other state only (not to the drivers) at update time.
    • Pros:
      • Message object metadata is fully described. This should make it easy both to support fine-grained tuning and to expose conflicts/changes to the rascal.
    • Cons:
      • ?
  4. At update time, mark the unnecessary changes in the other state controller.
    • Pros:
      • ?
    • Cons:
      • We must retain the collection of changes gathered at discover time (RAM overhead) in each state controller or make other reads to our state (might reduce performance).
      • The same filtering results are computed twice (at each side).
  5. The engine filters and stores the filtered changes to its own database on disk.
    • Pros:
      • ?
    • Cons:
      • One more channel and one more backend to get this async.
      • Yet another database on disk.
      • The database has to be merged at some point to update both states.

We need a way to keep drivers from trying to apply unnecessary changes. Not filtering out unnecessary changes would hurt (performance, at least). There is no obvious solution to this concern. However, fix (3) looks like it enables very good flexibility at low cost. It is compatible with fix (1) for misalignments.
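
A minimal sketch of how fix (3) might look, assuming a hypothetical `Change` object carrying a `driver_needed` metadata field: the engine marks identical changes instead of dropping them, and the controller skips the driver for marked changes while still recording them in the state it writes to.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical change record with metadata for the controllers."""
    uid: int
    flags: frozenset
    driver_needed: bool = True  # Assumed field: False means "state only".


def engine_mark(left_changes, right_changes):
    """Mark, rather than drop, the changes made identically on both sides."""
    left_by_uid = {c.uid: c for c in left_changes}
    for change in right_changes:
        twin = left_by_uid.get(change.uid)
        if twin is not None and twin.flags == change.flags:
            change.driver_needed = False
            twin.driver_needed = False


def controller_update(driver, state, changes):
    """Apply changes; return the UIDs actually written to the driver."""
    written = []
    for change in changes:
        if change.driver_needed:
            driver[change.uid] = change.flags
            written.append(change.uid)
        # The state is updated in every case, so the change is not
        # reported again at the next sync.
        state[change.uid] = change.flags
    return written


left_changes = [Change(1, frozenset({"R"})), Change(2, frozenset({"F"}))]
right_changes = [Change(1, frozenset({"R"}))]
engine_mark(left_changes, right_changes)

right_driver = {1: frozenset({"R"})}
state = {}
written = controller_update(right_driver, state, left_changes)
print(written, sorted(state))
```

Only message 2 reaches the driver, while both messages are recorded in the state.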

Pros

  • The sync engine and controllers hold simple 2-way merge algos.
  • Fits well in a fully async environment.
  • Recovers from errors at a low performance penalty.
  • Globally performance friendly.

Cons

  • The full merge algorithm is spread across several components. Is it really a con?
  • The engine needs a way to update the chain of controllers.

Compared to 02

Pros

  • Better performance.

Cons

  • Requires more workers.
  • Requires more channels.