Automated Watermark based GC and Transient Quota allocation #134

Open
aarshkshah1992 opened this issue Aug 1, 2022 · 4 comments

@aarshkshah1992
Contributor

aarshkshah1992 commented Aug 1, 2022

This is a meta-issue to track the work of introducing automated, watermark-based LRU GC of transients, along with a quota reservation mechanism that allows downloading transients whose size is not known upfront.

The work is spread across multiple PRs.

High level overview

  • The dagstore now performs automated, watermark-based GC (from the high watermark down to the low watermark) of transient files.

  • Users who want this feature must configure a maximum size for the transients directory; the dagstore guarantees that the size of the directory never exceeds that limit.

  • Users must also configure a high and a low watermark for the transients directory. The dagstore kicks off an automated GC when it detects that the size of the transients directory has crossed the high watermark, and attempts to bring the directory size back below the low watermark.

  • Users must configure a GC strategy that recommends the order in which reclaimable shards are GC'd by the automated mechanism. The dagstore ships with a built-in LRU GC strategy, but users are free to implement their own. See the documentation of GarbageCollectionStrategy for more details.

  • A quota reservation mechanism has been introduced for downloading transients whose size is not known upfront. To download such a CAR, the downloader first obtains a reservation from the dagstore for a preconfigured number of bytes, downloads that many bytes, and then goes back to the allocator for a further reservation if it has not finished downloading the transient. At the end, it releases any unused reserved bytes back to the allocator.

  • The existing manual GC mechanism works as before; no changes have been made to it.

Known Edge Case

There is a known, unhandled edge case in the code.

If a group of concurrent transient downloads ends up reserving all the available space in the transients directory, but none of them has reserved enough to complete its own download, then all of them will back off and retry together, waiting for more space to become available. However, no space will become available until one of them exhausts its backoff-retry attempts, fails the download, and releases its reserved space. Thus, the dagstore will not make any progress with new downloads until one of the downloads fails and releases its reservation.

However, this edge case should be mitigated by:

  1. Rate limiting the number of concurrent transients fetches
  2. Giving higher reservations to older downloads vs newer downloads.
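The first mitigation, capping the number of concurrent transient fetches, could look roughly like the sketch below. The `fetchLimiter` type and `runFetches` helper are hypothetical and not part of the dagstore; the limiter is a plain buffered-channel semaphore. It only reduces the chance of the stall, since transient sizes are unknown upfront.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fetchLimiter caps how many transient fetches may run at once.
type fetchLimiter struct {
	slots chan struct{}
}

func newFetchLimiter(n int) *fetchLimiter {
	return &fetchLimiter{slots: make(chan struct{}, n)}
}

// do blocks until a slot is free, runs fetch, then frees the slot.
func (l *fetchLimiter) do(fetch func()) {
	l.slots <- struct{}{}
	defer func() { <-l.slots }()
	fetch()
}

// runFetches starts total dummy fetches through a limiter with limit slots
// and returns the highest number of fetches observed running at once.
func runFetches(total, limit int) int32 {
	l := newFetchLimiter(limit)
	var cur, peak int32
	var wg sync.WaitGroup
	for i := 0; i < total; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(func() {
				n := atomic.AddInt32(&cur, 1)
				// Record a new peak if this is the most we have seen.
				for {
					p := atomic.LoadInt32(&peak)
					if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
						break
					}
				}
				atomic.AddInt32(&cur, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	// With 2 slots, the observed peak concurrency never exceeds 2.
	fmt.Println(runFetches(20, 2) <= 2) // prints: true
}
```

With fewer downloads competing for the same quota, each one is more likely to obtain the follow-up reservations it needs before the directory fills up.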

PRs

  1. Upgrader should reserve and release allocations if transient size is unknown (#130).
  2. Dagstore event loop does automated watermark-based GC and handles quota allocations and reservations: "Dagstore gc and event loop changes" (#131).
  3. Interface for extensible GC with a default LRU implementation: "Interfaces for Extendable GC and LRU implementation" (#132).
  4. Config for automated GC and tests for the entire feature: "Config for Automated GC and tests for this feature" (#133).
@raulk
Member

raulk commented Oct 17, 2022

> Users will also have to configure a high and low watermark for the transients directory. The dagstore will kickstart an automated GC when it detects that the size of the transients directory has crossed the high watermark and will attempt to bring down the directory size below the low watermark threshold.

With two-watermark systems, the goal tends to be to keep the value between the watermarks. What's described here seems to be more of a trigger/target system? ("When value is above <trigger>, activate GC to bring it to, or below, <target>")

@raulk
Member

raulk commented Oct 17, 2022

> Known Edge Case

The edge case seems pretty dangerous. Is it possible to identify this livelock situation in the garbage collector, and interrupt transient downloads to vacate more space?

@raulk
Member

raulk commented Oct 17, 2022

Note that there are new edge cases that emerge from such situations, e.g. a malicious user forcing the system to download a huge transient to DoS all other active downloads.

@raulk
Member

raulk commented Oct 17, 2022

Which protocols are unable to report a shard size in your use case? Having unknown shard sizes is acceptable for trusted scenarios, but definitely a no-go for untrusted/adversarial scenarios. An attacker may exploit the system by forcing it to (1) download a shard with unknown size from themselves, and (2) send infinite garbage (cheap to do).
