Split explorer and add permanent storage #2759

Open
7 tasks
ecioppettini opened this issue Nov 26, 2020 · 4 comments

@ecioppettini
Contributor

Move the explorer to its own binary with non-volatile storage

(this is a draft, I'll clean it up a bit more later)

Plan

Split the node and the explorer

  • Remove the ExplorerDB dependencies on the node, both the Blockchain object and the Tip. Otherwise, it's not possible to move the explorer without taking the ledger too. I've kind of already started on this, but it's all over the place and unfinished.
  • Move the explorer to a separate binary. (add separate explorer service #2467)
  • Add a websocket block subscription so the new service can know when it needs to fetch and apply a new block (Add a publish/subscribe api for blockchain events #2354).
  • Add some way for the explorer service to bootstrap an existing chain (or multichain). There is a really simple mechanism in place in add separate explorer service #2467 but it's probably not optimal.
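A rough sketch of how the subscription and the bootstrap could share one code path: on every notified tip, walk parent links back until a block the explorer already knows, then apply the missing blocks oldest first. With an empty explorer this is a full bootstrap. `BlockSource`, `Explorer`, and `sync_to` are made-up names for illustration, not the API from #2467 or #2354, and the `HashMap` stands in for the node.

```rust
use std::collections::{HashMap, HashSet};

pub type BlockId = u32;

#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub id: BlockId,
    pub parent: Option<BlockId>,
}

/// Hypothetical stand-in for whatever API the explorer uses to fetch blocks.
pub trait BlockSource {
    fn get_block(&self, id: BlockId) -> Option<Block>;
}

/// An in-memory map plays the role of the node in this sketch.
impl BlockSource for HashMap<BlockId, Block> {
    fn get_block(&self, id: BlockId) -> Option<Block> {
        self.get(&id).cloned()
    }
}

#[derive(Default)]
pub struct Explorer {
    pub known: HashSet<BlockId>,
    /// Order in which blocks were applied (parents always before children).
    pub applied: Vec<BlockId>,
}

impl Explorer {
    /// Called for each tip announced by the subscription. Returns how many
    /// blocks were applied, or the id of a block the source could not provide.
    pub fn sync_to(&mut self, source: &impl BlockSource, tip: BlockId) -> Result<usize, BlockId> {
        // Collect the unknown part of the branch, newest first.
        let mut missing = Vec::new();
        let mut cursor = Some(tip);
        while let Some(id) = cursor {
            if self.known.contains(&id) {
                break;
            }
            let block = source.get_block(id).ok_or(id)?;
            cursor = block.parent;
            missing.push(block);
        }
        // Apply oldest first so every block sees its parent already applied.
        let count = missing.len();
        for block in missing.into_iter().rev() {
            self.known.insert(block.id);
            self.applied.push(block.id);
        }
        Ok(count)
    }
}
```

Nothing is applied if a block in the middle of the branch cannot be fetched, so a failed sync can simply be retried on the next notification.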

Add permanent storage

  • Add some unit tests for the internals. Generate some chain, apply blocks and query (without graphql), etc. This is not really necessary, but I think it'll save time when doing the following two things.
  • Split the ExplorerDB in two parts: a stable part, also in memory but without immutable data structures, and an unstable part that works the way it does now, with HAMTs.
  • Add an implementation of the stable interface backed by an actual permanent database. It may be sled, sqlite, postgres?
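To make the "stable interface" idea a bit more concrete, here is a minimal sketch of what it might look like, with the in-memory (mutable, non-persistent) implementation that would come first. The trait name, its methods, and the `BlockMeta` fields are all assumptions for illustration; the real interface would need whatever the GraphQL queries require. A sled/sqlite/postgres backend would be a second `impl` of the same trait.

```rust
use std::collections::BTreeMap;

pub type BlockId = u64;
pub type ChainLength = u32;

/// Illustrative subset of what the explorer might keep per confirmed block.
#[derive(Clone, Debug, PartialEq)]
pub struct BlockMeta {
    pub id: BlockId,
    pub chain_length: ChainLength,
    pub tx_count: u32,
}

/// Hypothetical interface for the stable part. Confirmed blocks form a single
/// chain, so they can be keyed by chain length.
pub trait StableStore {
    fn insert_block(&mut self, block: BlockMeta);
    fn block_at(&self, chain_length: ChainLength) -> Option<BlockMeta>;
    fn last_chain_length(&self) -> Option<ChainLength>;
}

/// First implementation: plain mutable in-memory map, no structural sharing.
#[derive(Default)]
pub struct InMemoryStable {
    by_length: BTreeMap<ChainLength, BlockMeta>,
}

impl StableStore for InMemoryStable {
    fn insert_block(&mut self, block: BlockMeta) {
        self.by_length.insert(block.chain_length, block);
    }

    fn block_at(&self, chain_length: ChainLength) -> Option<BlockMeta> {
        self.by_length.get(&chain_length).cloned()
    }

    fn last_chain_length(&self) -> Option<ChainLength> {
        // BTreeMap keys are sorted, so the last key is the highest chain length.
        self.by_length.keys().next_back().copied()
    }
}
```

Keeping the unit tests from the first bullet written against the trait rather than the concrete type would let the same tests run against the database-backed version later.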

All of these things can be done without the separate binary, although I'd do the last one after splitting.

Current plan for the stable/unstable part (will move this to its own issue later, and add some graphics)

Pick a block confirmation depth to decide what goes to stable storage.
Queries can be satisfied by consulting both the stable and the unstable part and merging (probably concatenating) the results.
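For a query like "transactions by address", the merge could be as simple as the sketch below. It assumes that everything in the stable part is older than everything in the unstable part for the branch being queried, so concatenation preserves order. `AddressIndex` and `transactions_by_address` are illustrative names, and plain `HashMap`s stand in for both parts.

```rust
use std::collections::HashMap;

pub type Address = String;
pub type TxId = u64;

/// Stand-in for both the stable index and one tip's unstable index.
pub type AddressIndex = HashMap<Address, Vec<TxId>>;

/// Stable results first (older, confirmed), then the unstable ones from the tip.
pub fn transactions_by_address(
    stable: &AddressIndex,
    unstable: &AddressIndex,
    address: &str,
) -> Vec<TxId> {
    let older = stable.get(address).map(Vec::as_slice).unwrap_or(&[]);
    let newer = unstable.get(address).map(Vec::as_slice).unwrap_or(&[]);
    older.iter().chain(newer).copied().collect()
}
```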

Mechanism: After some point, when a new block comes in and forms a longer chain, an old block becomes confirmed and can be moved from the unstable to the stable part. The tricky part here is how to remove it from the unstable part in memory, as the data structures are immutable.

The proposed plan is to apply an undo/inverse operation of the confirmed block to the incoming tip state (and to any new block that comes later with the same chain length). This means the memory is not actually released until all the blocks in the middle get dumped/garbage collected, but eventually it will be. The intermediate states don't really matter because we only need to run queries at the tips.
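A toy model of the undo idea, to check that the bookkeeping works out. Assumptions, all mine: each block's state is a flat list of transaction ids (cloned per block here, where the real thing would be a HAMT with structural sharing), the genesis block has chain length 0, parents are always applied before children, and forks are never deeper than the confirmation depth. Each state remembers the chain length of the oldest block it still contains (`base`), which is what lets a later block on another branch with the same chain length apply the same undo without confirming anything twice.

```rust
use std::collections::HashMap;

pub type BlockId = u32;
pub type TxId = u32;

#[derive(Clone, Debug, PartialEq)]
pub struct Block {
    pub id: BlockId,
    pub parent: Option<BlockId>,
    pub chain_length: u32,
    pub txs: Vec<TxId>,
}

/// Unstable index as seen from one block.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct State {
    /// Chain length of the oldest block whose data is still in `txs`.
    pub base: u32,
    pub txs: Vec<TxId>,
}

pub struct Explorer {
    pub depth: u32,
    /// Confirmed block ids; the position in the vector is the chain length.
    pub stable: Vec<BlockId>,
    pub blocks: HashMap<BlockId, Block>,
    pub states: HashMap<BlockId, State>,
}

impl Explorer {
    pub fn new(depth: u32) -> Self {
        Explorer { depth, stable: Vec::new(), blocks: HashMap::new(), states: HashMap::new() }
    }

    /// Walk parent links from `from` down to the given chain length.
    fn ancestor_at(&self, from: BlockId, chain_length: u32) -> Option<&Block> {
        let mut cur = self.blocks.get(&from)?;
        while cur.chain_length > chain_length {
            cur = self.blocks.get(&cur.parent?)?;
        }
        if cur.chain_length == chain_length { Some(cur) } else { None }
    }

    /// Panics if the parent has not been applied yet.
    pub fn apply_block(&mut self, block: Block) {
        let (id, chain_length) = (block.id, block.chain_length);
        let mut state = match block.parent {
            Some(parent) => self.states[&parent].clone(),
            None => State::default(),
        };
        state.txs.extend(block.txs.iter().copied());
        self.blocks.insert(id, block);

        // Every ancestor at least `depth` blocks below the new block is confirmed.
        while state.base + self.depth <= chain_length {
            let confirmed = self
                .ancestor_at(id, state.base)
                .cloned()
                .expect("ancestors of an applied block are known");
            // Move to stable only once, the first time a branch gets this long.
            if self.stable.len() as u32 == confirmed.chain_length {
                self.stable.push(confirmed.id);
            }
            // Undo: drop the confirmed block's entries from the new state only.
            state.txs.retain(|tx| !confirmed.txs.contains(tx));
            state.base += 1;
        }
        self.states.insert(id, state);
    }
}
```

The parent states are left untouched, so they keep holding the confirmed data until they are dropped, which matches the "memory is released eventually" caveat above.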

Alternative: Don't keep the hamt-based indices

Just keep a mutable index per branch, and drop and rebuild them from the latest stable block to the tips when needed, as the HAMTs are not that necessary if most of the state is in stable storage. This approach seems worse, but it may be simpler and may work if forks are shallow?
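The rebuild step for this alternative might look roughly like the following: walk back from a tip to the latest stable block, then replay the unstable blocks in order into a fresh mutable index. The cost is proportional to the number of unstable blocks on the branch, which is why it only looks viable with shallow forks. The block layout and the function name are assumptions for the sketch.

```rust
use std::collections::HashMap;

pub type BlockId = u32;
pub type Address = &'static str;
pub type TxId = u32;

pub struct Block {
    pub id: BlockId,
    pub parent: Option<BlockId>,
    /// (address, transaction) pairs touched by this block.
    pub txs: Vec<(Address, TxId)>,
}

/// Rebuild the mutable index for the branch ending at `tip`, covering only the
/// blocks after `last_stable`. Returns `None` if a block is missing or the tip
/// does not descend from `last_stable`.
pub fn rebuild_branch_index(
    blocks: &HashMap<BlockId, Block>,
    last_stable: BlockId,
    tip: BlockId,
) -> Option<HashMap<Address, Vec<TxId>>> {
    // Collect the unstable blocks of this branch, newest first.
    let mut path = Vec::new();
    let mut cursor = tip;
    while cursor != last_stable {
        let block = blocks.get(&cursor)?;
        path.push(block);
        cursor = block.parent?;
    }
    // Replay them oldest first into a fresh index.
    let mut index: HashMap<Address, Vec<TxId>> = HashMap::new();
    for block in path.into_iter().rev() {
        for &(address, tx) in &block.txs {
            index.entry(address).or_default().push(tx);
        }
    }
    Some(index)
}
```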

Alternative: Don't keep anything in memory (I discarded these ideas, but just for completeness).

  • One way: find a database that supports cheap snapshots/clones with structural sharing. I don't think such a thing exists in a way that works for us.
  • Use some form of partition tables/trees with leading keys. I think this may be possible, but I don't think it really offers anything over the hybrid approach, and it's not trivial either.
@ecioppettini ecioppettini added enhancement New feature or request epic high level issues A-explorer Area: Explorer API and backend labels Nov 26, 2020
@ecioppettini ecioppettini self-assigned this Nov 26, 2020
@mzabaluev
Contributor

Pick a block confirmation depth to decide what goes to stable storage.

This may need to be synchronized with the node. @eugene-babichenko is the corresponding parameter in jormungandr storage derivable from block 0?

@eugene-babichenko
Contributor

You need to look up epoch_stability_depth.

@mzabaluev
Contributor

You need to look up epoch_stability_depth.

Thanks. So this parameter comes from genesis, but can also be updated in a later transaction. I'm not sure how it applies across different branches, if strictly speaking its value depends on the ledger state of each branch tip.

@ecioppettini
Contributor Author

I think epoch_stability_depth doesn't matter for different branches, as it can only change at the epoch boundary (although I need to check to be sure). In any case, yes, the explorer can use that too, although that's probably more important for the branch selection algorithm.
OTOH, I think the node is still not enforcing that rule at branch selection, so I'm actually not sure what happens to the node if it gets broken.
