Split explorer and add permanent storage #2759

ecioppettini · 2020-11-26T19:18:44Z

Move explorer to its own binary and with a non-volatile storage

(this is a draft, I'll clean it up a bit more later)

Plan

Split the node and the explorer

Remove the ExplorerDB dependencies on the node: both the Blockchain object and the Tip. Otherwise, it's not possible to move the explorer without taking the ledger too. I've kind of already started with this, but it's all over the place and unfinished.
Move the explorer to a separate binary. (add separate explorer service #2467)
Add a websocket block subscription so the new service can know when it needs to fetch and apply a new block (Add a publish/subscribe api for blockchain events #2354).
Add some way for the explorer service to bootstrap an existing chain (or multichain). There is a really simple mechanism in place in add separate explorer service #2467 but it's probably not optimal.

Add permanent storage

Add some unit tests for the internals. Generate some chain, apply blocks and query (without graphql), etc. This is not really necessary, but I think it'll save time when doing the following two things.
Split the ExplorerDB in two parts. A stable part, in memory too, but without using immutable data structures, and an unstable part just like the way it works now with hamt's.
Add an implementation of the stable interface with an actual permanent database. It may be sled, sqlite, postgres?

All of these things can be done without the separate binary, although I'd do the last one after splitting.
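To make the "stable interface" idea a bit more concrete, here is a minimal sketch of what such a trait and its in-memory implementation could look like. Everything in it (`StableStore`, `StableBlock`, `InMemoryStable`, the `u64` ids and `String` addresses) is hypothetical and only for illustration; the real ExplorerDB types are different, and a sled/sqlite/postgres backend would implement the same trait against the database instead of `HashMap`s.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the explorer's real hash and address types.
pub type BlockId = u64;
pub type Address = String;

/// A confirmed block, reduced to the fields this sketch indexes.
#[derive(Clone, Debug, PartialEq)]
pub struct StableBlock {
    pub id: BlockId,
    pub chain_length: u32,
    pub addresses: Vec<Address>,
}

/// Hypothetical interface for the stable part. A permanent database
/// would provide another implementation of this trait.
pub trait StableStore {
    fn insert_block(&mut self, block: StableBlock);
    fn block(&self, id: BlockId) -> Option<StableBlock>;
    fn blocks_by_address(&self, address: &str) -> Vec<BlockId>;
}

/// In-memory implementation using plain mutable maps (no immutable
/// data structures), since confirmed blocks are never rolled back.
#[derive(Default)]
pub struct InMemoryStable {
    blocks: HashMap<BlockId, StableBlock>,
    by_address: HashMap<Address, Vec<BlockId>>,
}

impl StableStore for InMemoryStable {
    fn insert_block(&mut self, block: StableBlock) {
        // Update the secondary index first, then store the block itself.
        for address in &block.addresses {
            self.by_address
                .entry(address.clone())
                .or_default()
                .push(block.id);
        }
        self.blocks.insert(block.id, block);
    }

    fn block(&self, id: BlockId) -> Option<StableBlock> {
        self.blocks.get(&id).cloned()
    }

    fn blocks_by_address(&self, address: &str) -> Vec<BlockId> {
        self.by_address.get(address).cloned().unwrap_or_default()
    }
}
```

Keeping the trait small should also make the unit tests mentioned above easy to write against either implementation.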

Current plan for the stable/unstable part (will move this to its own issue later, and add some graphics)

Pick a block confirmation depth to decide what goes to stable storage.
Queries can be satisfied by consulting the stable part first and then the unstable part, and merging (probably concatenating) the results.
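A rough sketch of the merge step, assuming both parts expose an address-to-transactions index. The function name and the `u64`/`String` types are made up for illustration, and a std `HashMap` stands in for the hamt of the unstable part. Concatenation is enough here because every stable entry is older than every unstable one on the same branch.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the explorer's real id and address types.
pub type TxId = u64;
pub type Index = HashMap<String, Vec<TxId>>;

/// Answer "transactions by address" for one branch tip by reading the
/// stable index first and appending the unstable entries for that branch.
pub fn transactions_by_address(
    stable: &Index,
    unstable_branch: &Index,
    address: &str,
) -> Vec<TxId> {
    // Confirmed transactions come first, in chain order.
    let mut result = stable.get(address).cloned().unwrap_or_default();
    // Then the not-yet-confirmed ones from the branch being queried.
    if let Some(recent) = unstable_branch.get(address) {
        result.extend_from_slice(recent);
    }
    result
}
```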

Mechanism: After some point, when a new block comes and it forms a longer chain, it means an old block is confirmed and can be moved from the unstable to the stable part. The tricky part here is how to remove it from the unstable part in memory, as the data structures are immutable.

The proposed plan is to apply an undo/inverse operation of the confirmed block to the incoming tip state (and to any new block that comes later with the same chain length). This means the memory is not actually released until all the blocks in the middle get dumped/garbage collected, but it eventually will be. The intermediate states don't really matter, because we only need to do queries at the tips.
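The undo idea could look roughly like the sketch below. All names (`Explorer`, `apply`, `undo`, `add_block`) are hypothetical, the index is reduced to address-to-transactions, and cloning a std `HashMap` stands in for deriving a new hamt from the parent state (which would be cheap thanks to structural sharing). It also assumes the caller passes the new block's ancestors.

```rust
use std::collections::HashMap;

pub type TxId = u64;
pub type State = HashMap<String, Vec<TxId>>;

#[derive(Clone)]
pub struct Block {
    pub chain_length: u32,
    pub txs: Vec<(String, TxId)>,
}

/// Derive the child state from the parent state (the parent is untouched).
pub fn apply(parent: &State, block: &Block) -> State {
    let mut next = parent.clone();
    for (address, tx) in &block.txs {
        next.entry(address.clone()).or_default().push(*tx);
    }
    next
}

/// Inverse of `apply`: drop the entries contributed by a confirmed block.
pub fn undo(state: &mut State, confirmed: &Block) {
    for (address, tx) in &confirmed.txs {
        let now_empty = match state.get_mut(address) {
            Some(txs) => {
                txs.retain(|t| t != tx);
                txs.is_empty()
            }
            None => false,
        };
        if now_empty {
            state.remove(address);
        }
    }
}

pub struct Explorer {
    pub depth: u32,         // block confirmation depth
    pub stable: State,      // stable part (plain mutable index)
    pub stable_length: u32, // chain length of the last confirmed block
}

impl Explorer {
    /// Build the state for `block` on top of `parent_state`. If the block
    /// confirms an ancestor, that ancestor is moved to stable (once) and
    /// undone from the new tip state (for every tip at this chain length).
    pub fn add_block(&mut self, parent_state: &State, ancestors: &[Block], block: &Block) -> State {
        let mut tip = apply(parent_state, block);
        if block.chain_length > self.depth {
            let confirmed_length = block.chain_length - self.depth;
            let confirmed = ancestors.iter().find(|b| b.chain_length == confirmed_length);
            if let Some(confirmed) = confirmed {
                if confirmed_length > self.stable_length {
                    for (address, tx) in &confirmed.txs {
                        self.stable.entry(address.clone()).or_default().push(*tx);
                    }
                    self.stable_length = confirmed_length;
                }
                undo(&mut tip, confirmed);
            }
        }
        tip
    }
}
```

Note that the parent state still contains the confirmed block's entries; only the new tip state (and any sibling built at the same chain length) has them removed, which is why the memory is released only once the older states are dropped.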

Alternative: Don't keep the hamt-based indices

Just keep a mutable index per branch, and drop them and rebuild from the latest stable to the tips when needed, as the hamt's are not that necessary if most of the state is in stable storage. This approach seems worse, but it may be simpler and may work if forks are shallow?
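For comparison, rebuilding a mutable per-branch index could be sketched as follows: walk back from the tip to the latest stable block through parent pointers, then replay the blocks forward. The names and types are hypothetical, and the sketch assumes the unstable blocks are kept in a map keyed by block id. The cost is proportional to the distance between the stable block and the tip, which is why this is only attractive if forks are shallow.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the explorer's real id types.
pub type BlockId = u64;
pub type TxId = u64;

pub struct Block {
    pub parent: BlockId,
    pub txs: Vec<(String, TxId)>,
}

/// Rebuild the address index for the branch ending at `tip`, covering only
/// the blocks after `last_stable`. Returns `None` if the walk reaches a
/// block that is not in `blocks` before finding `last_stable`.
pub fn rebuild_branch_index(
    blocks: &HashMap<BlockId, Block>,
    last_stable: BlockId,
    tip: BlockId,
) -> Option<HashMap<String, Vec<TxId>>> {
    // Collect the unstable blocks of this branch, newest first.
    let mut path = Vec::new();
    let mut current = tip;
    while current != last_stable {
        let block = blocks.get(&current)?;
        path.push(block);
        current = block.parent;
    }

    // Replay them oldest first into a fresh mutable index.
    let mut index: HashMap<String, Vec<TxId>> = HashMap::new();
    for block in path.iter().rev() {
        for (address, tx) in &block.txs {
            index.entry(address.clone()).or_default().push(*tx);
        }
    }
    Some(index)
}
```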

Alternative: Don't keep anything in memory (I discarded these ideas, but just for completeness).

One way: find a database that supports cheap snapshots/clones with structural sharing. I don't think such a thing exists in a way that works for us.
Another: use some form of partition tables/trees with leading keys. I think this may be possible, but I don't think it really offers anything over the hybrid approach, and it's not trivial either.


mzabaluev · 2020-11-30T14:15:05Z

Pick a block confirmation depth to decide what goes to stable storage.

This may need to be synchronized with the node. @eugene-babichenko is the corresponding parameter in jormungandr storage derivable from block 0?

eugene-babichenko · 2020-11-30T14:52:11Z

You need to look up epoch_stability_depth.

mzabaluev · 2020-11-30T15:52:09Z

You need to look up epoch_stability_depth.

Thanks. So this parameter comes from genesis, but can also be updated in a later transaction. I'm not sure how it applies across different branches, if strictly speaking its value depends on the ledger state of each branch tip.

ecioppettini · 2020-12-01T20:05:37Z

I think epoch_stability_depth doesn't matter for different branches, as it can only change at the epoch boundary (although I need to check to be sure). In any case, yes, the explorer can use that too, although that's probably more important for the branch selection algorithm.
OTOH, I think the node is still not enforcing that rule at branch selection, so I'm actually not sure what happens to the node if it gets broken.

ecioppettini added the enhancement (New feature or request), epic (high level issues), and A-explorer (Area: Explorer API and backend) labels on Nov 26, 2020

ecioppettini self-assigned this Nov 26, 2020

ecioppettini mentioned this issue Apr 10, 2021

Split explorer storage in stable and unstable #3205

Closed

ecioppettini mentioned this issue Apr 19, 2021

Move explorer to it's own binary crate #3227

Open

3 tasks
