diff --git a/hermes-common/src/main/java/pl/allegro/tech/hermes/infrastructure/zookeeper/ZookeeperTopicRepository.java b/hermes-common/src/main/java/pl/allegro/tech/hermes/infrastructure/zookeeper/ZookeeperTopicRepository.java
index cf0da2e965..31c6368173 100644
--- a/hermes-common/src/main/java/pl/allegro/tech/hermes/infrastructure/zookeeper/ZookeeperTopicRepository.java
+++ b/hermes-common/src/main/java/pl/allegro/tech/hermes/infrastructure/zookeeper/ZookeeperTopicRepository.java
@@ -93,10 +93,18 @@ public void createTopic(Topic topic) {
      * and upon receiving 'KeeperException.NotEmptyException' it tries to remove children recursively
      * and then retries the node removal. This means that there is a potentially large time gap between
      * removal of 'topic/subscriptions' node and 'topic' node, especially when topic removal is being done
-     * in remote DC. It turns out that 'PathChildrenCache' used for 'HierarchicalCacheLevel' in
-     * consumers and management recreates 'topic/subscriptions' node when deleted. If the recreation is faster
-     * than the removal of 'topic' node, than the whole removal process must be repeated resulting in a lengthy loop
-     * that may even result in StackOverflowException.
+     * in remote DC.
+     *
+     * It turns out that 'PathChildrenCache' used by 'HierarchicalCacheLevel' in
+     * Consumers and Frontend listens for 'topic/subscriptions' changes and recreates that node when deleted.
+     * If the recreation happens between the 'topic/subscriptions' and 'topic' node removals,
+     * then the whole removal process must be repeated, resulting in a lengthy loop that may even end in a StackOverflowError.
+     * An example of that scenario:
+     * 1. DELETE 'topic' - issued by management, fails with KeeperException.NotEmptyException
+     * 2. DELETE 'topic/subscriptions' - issued by management, succeeds
+     * 3. CREATE 'topic/subscriptions' - issued by frontend, succeeds
+     * 4. DELETE 'topic' - issued by management, fails with KeeperException.NotEmptyException
+     * [...]
      *
      * To solve this we must remove 'topic' and 'topic/subscriptions' atomically. However, we must also remove
      * other 'topic' children. Transaction API does not allow for 'optional' deletes so we: