[Remote Store] Changes requested for segment replication integration. #4628

Open
dreamer-89 opened this issue Sep 28, 2022 · 2 comments
Labels
enhancement Enhancement or improvement to existing feature or request Indexing:Replication Issues and PRs related to core replication framework eg segrep Storage:Durability Issues and PRs related to the durability framework

Comments

@dreamer-89
Member

dreamer-89 commented Sep 28, 2022

Coming from #4555, this issue captures the changes needed on the remote store side for segment replication integration.

  1. Add StorefileMetadata to enable comparison of segment files. Remote store provides UploadedSegmentMetadata, but it differs from the Lucene checksum and needs hash and length attributes. With the existing local store implementation, segrep relies on the store file metadata passed as part of CheckpointInfoResponse; the replica computes the diff against the store files in its local directory.
  2. To provide incremental backups (refresh without new commit points), remote store needs the capability to store and retrieve SegmentInfos. Currently, segrep builds the SegmentInfos from bytes transferred over transport from the primary. Similar support is needed from remote store, so that SegmentInfos can be stored by the primary and retrieved on replicas. A better solution would be to commit rather than refresh on remote store (that is, not provide a remote store refresh option). In this scenario, the replica can generate the SegmentInfos from the commit point (by first downloading the segment files and then reading it from the store directory). This option also simplifies recovery and failover.
  3. Is it possible to get rid of fsyncs on IW commits on the primary, since remote store already provides durability guarantees? Translogs can be used for recovery in case of failures.
  4. I see that the remote store refresh listener purges commit points older than the last N. This can be problematic when an aggressive primary keeps moving ahead and a replica never catches up. Is it possible to have a configurable data deletion policy?
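
For point 1, the comparison a replica needs can be sketched roughly as below. This is a minimal, language-neutral illustration in Python, not the OpenSearch implementation: `FileMetadata` and `diff_segment_files` are hypothetical names, and the sketch assumes that matching length plus checksum is enough to treat two files as identical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical stand-in for per-file store metadata (not the OpenSearch class)."""
    name: str
    length: int
    checksum: str


def diff_segment_files(source, local):
    """Classify the source's files relative to the replica's local files.

    `source` and `local` map file name -> FileMetadata.
    Returns three sorted lists of file names: identical, different, missing.
    """
    identical, different, missing = [], [], []
    for name, meta in source.items():
        local_meta = local.get(name)
        if local_meta is None:
            # The replica has never seen this file; it must be fetched.
            missing.append(name)
        elif (local_meta.length, local_meta.checksum) == (meta.length, meta.checksum):
            # Same length and checksum: assumed safe to reuse the local copy.
            identical.append(name)
        else:
            # Same name but different content; it must be fetched again.
            different.append(name)
    return sorted(identical), sorted(different), sorted(missing)
```

Only the `different` and `missing` files would need to be downloaded, which is why the remote store metadata has to expose the same length and checksum attributes that the local store metadata does.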

Remote store design: #2700
Segment replication integration: #4555

@dreamer-89 dreamer-89 added enhancement Enhancement or improvement to existing feature or request untriaged labels Sep 28, 2022
@saratvemulapalli saratvemulapalli added distributed framework Storage:Durability Issues and PRs related to the durability framework labels Sep 29, 2022
@mch2
Member

mch2 commented Sep 29, 2022

> Is it possible to get rid of fsyncs on IW commits on primary as remote store already provides durability guarantees. Translogs can be used for recovery in case of failures.

+1. This is what I was suggesting here. The cost of performing a flush without fsync is effectively that of a refresh, yet we still write a segments_N that can be pushed to the store. This also has the added benefit of more accurate seqNo/checkpoint data sent between shards, since that data is stored in the user data of the commit; with the incremental points we are unable to use this. The tricky part is that we would be changing the default refresh/flush behavior: we would now perform the new fsync-less commit on an interval instead of a refresh.

Another required change is the sequencing of uploads to the remote store relative to when the translog (xlog) is truncated. The uploads currently occur in a refresh listener, so we'd have to guarantee those ops are preserved until the upload completes.
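
The ordering constraint above can be sketched as follows. This is a toy model in Python under stated assumptions, not OpenSearch code: `Translog` and `UploadTracker` are hypothetical classes, and the sketch assumes commits are cumulative, so a completed upload of a commit covers every operation up to that commit's max sequence number.

```python
class Translog:
    """Toy translog keyed by sequence number (hypothetical, not the OpenSearch class)."""

    def __init__(self):
        self._ops = {}

    def add(self, seq_no, op):
        self._ops[seq_no] = op

    def trim_up_to(self, seq_no):
        """Drop every operation with a sequence number <= seq_no."""
        self._ops = {s: op for s, op in self._ops.items() if s > seq_no}

    def seq_nos(self):
        return sorted(self._ops)


class UploadTracker:
    """Trims the translog only after the upload covering those ops has completed."""

    def __init__(self, translog):
        self._translog = translog
        self._pending = {}  # upload id -> max seq_no contained in that commit
        self.durable_seq_no = -1

    def on_commit(self, upload_id, max_seq_no):
        # The commit was written locally without fsync, so nothing is durable yet:
        # record it as pending and leave the translog untouched.
        self._pending[upload_id] = max_seq_no

    def on_upload_complete(self, upload_id):
        # Only now are the ops durable in the remote store and safe to trim.
        max_seq_no = self._pending.pop(upload_id)
        self.durable_seq_no = max(self.durable_seq_no, max_seq_no)
        self._translog.trim_up_to(self.durable_seq_no)
```

The point of the sketch is that `on_commit` never trims: truncation is driven by upload completion rather than by the local commit.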

@dreamer-89
Member Author

> Add StorefileMetadata to enable comparison of segment files. Remote store provides UploadedSegmentMetadata but it differs from lucene checksum and needs hash and length attributes. With existing local store implementation, segrep relies on store file metadata passed as part of CheckpointInfoResponse. Replica compares diff out of its store files in local directory store.

This issue tracks only this work. Remaining points are already discussed in #4555. More details below.

> In order to provide incremental backups (refresh without new commit points), remote store needs to provide the capability for storing and retrieving the SegmentInfos. Currently, segrep builds the SegmentInfos from bytes transferred over transport from primary. A similar support is needed from remote store where SegmentInfos can be stored from primary and retrieved on replicas. A better solution here would be to commit rather than refresh on remote store (do not provide remote store refresh option). In this scenario, replica can generate the SegmentInfos from the commit point (by first downloading the segment files & then read it from store directory). This option simplifies recovery & failover as well.

Uploading SegmentInfos is not needed, given the commit-based alternative described above and discussed in #4555.

> Is it possible to get rid of fsyncs on IW commits on primary as remote store already provides durability guarantees. Translogs can be used for recovery in case of failures.

This is already discussed in #4555. A bounded fsync approach will be used which keeps the recovery time bounded.

> I see remote store refresh listener purges last commit points older than N. This can be problematic for an aggressive primary and replica never catches up. Is it possible to have a configurable data deletion policy ?

Discussed in #4555: a combination of keeping the last N commit points and keeping the last M days of data will be used.
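
One way to read that combined policy is sketched below in Python. The function name and parameters are hypothetical, and the rule for combining the two conditions is an assumption: here a commit point is kept if it is among the newest N or is younger than M days, which the thread itself does not specify.

```python
from datetime import datetime, timedelta


def commits_to_delete(commits, keep_last_n, keep_days, now):
    """Return the generations of commit points that may be purged.

    `commits` is a list of (generation, created_at) tuples.
    Assumption: a commit is kept if it is among the newest `keep_last_n`
    OR is younger than `keep_days`; everything else is deletable.
    """
    newest_first = sorted(commits, key=lambda c: c[0], reverse=True)
    protected = {gen for gen, _ in newest_first[:keep_last_n]}
    cutoff = now - timedelta(days=keep_days)
    return sorted(
        gen
        for gen, created_at in commits
        if gen not in protected and created_at < cutoff
    )
```

Under this reading, a slow replica has at least M days to catch up to any commit point, regardless of how many commits an aggressive primary produces in that window.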
