Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(v2): block cleaner #3637

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

feat(v2): block cleaner #3637

wants to merge 20 commits into from

Conversation

aleks-p
Copy link
Contributor

@aleks-p aleks-p commented Oct 17, 2024

Adds a block cleaner package for tracking deleted blocks. The purpose is:

  • preventing deleted blocks from re-entering the system because of retries / DLQ recovery
  • remove deleted blocks from the bucket after a grace period

Some other details:

  • block tombstones are stored in memory and in bolt db
  • the Raft leader applies a Raft command on an interval
  • when applying the command, each metastore instance removes expired tombstones from memory and disk
  • the Raft leader also removes the block from the bucket when it applies the command

TODO:

  • improve test coverage
  • Analyze performance

@aleks-p aleks-p requested a review from a team as a code owner October 17, 2024 23:56
@kolesnikovae
Copy link
Collaborator

Looks good overall, however, I think we need to make sure that only the current leader removes blocks from the object storage. I might be mistaken, but this is not the case in the current version

@aleks-p
Copy link
Contributor Author

aleks-p commented Oct 18, 2024

Looks good overall, however, I think we need to make sure that only the current leader removes blocks from the object storage. I might be mistaken, but this is not the case in the current version

I've reworked things a bit, the blockcleaner package now acts mainly as a store for the markers. The Raft command and handling is moved to the metastore package. Instead of a leader check we now check that the instance that submitted the command against the instance running the command (using a generated id for now, but could be switched to use raft server ids).

I've also reduced the number of transactions dramatically, deletion is done in a single update transaction. Still lacking extensive test coverage but looks more robust in a dev deployment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants