Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for deals using a data segment index #154

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

willscott
Copy link
Contributor

A deal may, rather than a single car, be a series of concatinated car files with a segment index at the end containing an inclusion proof.
When such an index is found, we will allow reading multiple cars and taking valid cids found in all of them as the contents of the overall deal.

ref: filecoin-project/boost#1258

A deal may, rather than a single car, be a series of concatinated car files with a segment index at the end containing an inclusion proof.
When such an index is found, we will allow reading multiple cars and taking valid cids found in all of them as the contents of the overall deal.
@willscott willscott requested a review from dirkmc March 28, 2023 15:18
@willscott
Copy link
Contributor Author

filing filecoin-project/go-data-segment#9 to see if we can get a fixture to use for validation

@Wondertan
Copy link

This PR brings Filecoin implementation detail to Dagstore and multiple new dependencies. Is there a way to implement this using Dagstore's API on the layer above, rather then polluting its internals?

@willscott
Copy link
Contributor Author

willscott commented Apr 2, 2023

That's a good point, though carv2 indexes are in a way an equally opinionated existing equivalent. I'll see what it would look like for the 'get an index for this shard' to be fully handled by the caller

@raulk
Copy link
Member

raulk commented Apr 3, 2023

I think there's an implicit decision being taken here, which is to model segments as indexed entries over a single shard (= storage deal in the Filecoin context), instead of as dedicated shards. Establishing a shard for every segment was not possible in the past since there was no standard way of delineating logical units within a Filecoin deal. With PoDSI, this is now possible. Making shard == segment could let us keep the dagstore Filecoin-agnostic, but I suspect is not viable without additional complexity. @willscott could you walk us through the tradeoffs here?

@willscott
Copy link
Contributor Author

Making shard == segment could let us keep the dagstore Filecoin-agnostic, but I suspect is not viable without additional complexity. @willscott could you walk us through the tradeoffs here?

shard == segment

  • shard parse logic simplified as a single car file
  • leads to caching / transfer at sub-deal level, better reflecting the potential for different usage patterns of different cars packed together in a deal.
  • unsealed transfer of cars is no longer alone sufficient to re-create the deal - also need to to know order/offsets of them at least.

shard == deal

  • no change in mental model of what files-on-disk are for SP operators
  • the 'deal' is the object that is created / destroyed atomically
  • unsealing is only possible at a deal granularity
  • deals will remain a mix of single car vs segmented - so some file management will have to remain at a 'deal level' anyway

@raulk
Copy link
Member

raulk commented Apr 3, 2023

Thanks! I think the case for shard == segment isn't strong today, but it may strengthen over time as selective unsealing becomes available, PoDSI gains traction, and segment-based access patterns become a bottleneck and the next frontier for optimization. Agree with the implicit decision made here, but +1 to @Wondertan's request to invert the control of the index population, so that the API caller can feed the segments.

@willscott
Copy link
Contributor Author

cc @Kubuxu

willscott added a commit that referenced this pull request Apr 7, 2023
willscott added a commit to filecoin-project/lotus that referenced this pull request Apr 16, 2023
…gment indices when present.

This change will result in lotus parsing deals for local indexing according to filecoin-project/FIPs#512

This is a continuation of filecoin-project/dagstore#154 - This piece of the logic was seen as filecoin-specific rather than part of common dagstore functionality.
A parallel to this logic is expected to be ported also to boost, but there remain confguration in the use of the dagstore where this is the common locations where this indexing
functionality can be registered.
willscott added a commit to filecoin-project/lotus that referenced this pull request Apr 16, 2023
…gment indices when present.

This change will result in lotus parsing deals for local indexing according to filecoin-project/FIPs#512

This is a continuation of filecoin-project/dagstore#154 - This piece of the logic was seen as filecoin-specific rather than part of common dagstore functionality.
A parallel to this logic is expected to be ported also to boost, but there remain confguration in the use of the dagstore where this is the common locations where this indexing
functionality can be registered.
willscott added a commit to filecoin-project/lotus that referenced this pull request Apr 29, 2023
…gment indices when present.

This change will result in lotus parsing deals for local indexing according to filecoin-project/FIPs#512

This is a continuation of filecoin-project/dagstore#154 - This piece of the logic was seen as filecoin-specific rather than part of common dagstore functionality.
A parallel to this logic is expected to be ported also to boost, but there remain confguration in the use of the dagstore where this is the common locations where this indexing
functionality can be registered.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants