Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SetTables are missing on_clear #601

Open
cristianmatache opened this issue Jan 11, 2024 · 1 comment
Open

SetTables are missing on_clear #601

cristianmatache opened this issue Jan 11, 2024 · 1 comment

Comments

@cristianmatache
Copy link

cristianmatache commented Jan 11, 2024

mode has a method to clear all the elements in a set (https://github.com/faust-streaming/mode/blob/c0aa58181432402dca30aa5978179383863a185a/mode/utils/collections.py#L546-L548) which calls an .on_clear callback.
This .on_clear callback method is empty by default (https://github.com/faust-streaming/mode/blob/c0aa58181432402dca30aa5978179383863a185a/mode/utils/collections.py#L535-L536)

However, faust does not implement on_clear at all.

This means that the in-memory data structure is cleared but the change is not reflected in the changelog topic. Upon recovery elements that were previously cleared may re-appear. I think the fix would be very simple:

  • adding a new operation here:
    OPERATION_ADD: int = 0x1
    OPERATION_DISCARD: int = 0x2
    OPERATION_UPDATE: int = 0xF
  • implementing on_clear similar to
    def on_add(self, value: VT) -> None:
    self.manager.send_changelog_event(self.key, OPERATION_ADD, value)

    maybe setting the value to a tombstone?
  • clearing the set when replaying the messages from the changelog

    faust/faust/tables/sets.py

    Lines 95 to 111 in 4a09533

    def apply_changelog_event(self, operation: int, value: Any) -> None:
    if operation == OPERATION_ADD:
    self.data.add(value)
    elif operation == OPERATION_DISCARD:
    self.data.discard(value)
    elif operation == OPERATION_UPDATE:
    tup = cast(Iterable[List], value)
    added: List
    removed: List
    added, removed = tup
    self.data |= set(added)
    self.data -= set(removed)
    else:
    raise NotImplementedError(
    f"Unknown operation {operation}: key={self.key!r}"
    )
@bboggs-streambit
Copy link

bboggs-streambit commented May 23, 2024

@cristianmatache @wbarnha , Since on_key_del is implemented for Table (https://github.com/faust-streaming/faust/blob/master/faust/tables/table.py#L86-L94) Could we simply (naively) implement this in terms of __delitem__?,

e.g. something like, in Table define

    def on_clear(self):
      for k in self.data.keys():
        del self[k]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants