[README] Final Tweaks (#278)
Co-authored-by: xinifinity <[email protected]>
patrick-ogrady and xinifinity authored Sep 26, 2023
1 parent 74a7fde commit f1254e4
Showing 3 changed files with 67 additions and 70 deletions.
61 changes: 30 additions & 31 deletions README.md
@@ -1,30 +1,30 @@
# Firewood: non-archival blockchain key-value store with hyper-fast recent state retrieval.
# Firewood: Compaction-Less Database Optimized for Efficiently Storing Recent Merkleized Blockchain State

![Github Actions](https://github.com/ava-labs/firewood/actions/workflows/ci.yaml/badge.svg?branch=main)
[![Ecosystem license](https://img.shields.io/badge/License-Ecosystem-blue.svg)](./LICENSE.md)

> :warning: firewood is alpha-level software and is not ready for production
> use. Do not use firewood to store production data. See the
> [license](./LICENSE.md) for more information regarding firewood usage.
Firewood is an embedded key-value store, optimized to store blockchain state.
It prioritizes access to latest state, by providing extremely fast reads, but
also provides a limited view into past state. It does not copy-on-write the
state trie to generate an ever growing forest of tries like other databases,
but instead keeps one latest version of the trie index on disk and apply
in-place updates to it. This ensures that the database size is small and stable
during the course of running firewood. Firewood was first conceived to provide
> :warning: Firewood is alpha-level software and is not ready for production
> use. The Firewood API and on-disk state representation may change with
> little to no warning.
Firewood is an embedded key-value store, optimized to store recent Merkleized blockchain
state with minimal overhead. Firewood is implemented from the ground up to directly
store trie nodes on-disk. Unlike most state management approaches in the field,
it is not built on top of a generic KV store such as LevelDB/RocksDB. Firewood, like a
B+-tree based database, directly uses the trie structure as the index on-disk. Thus,
there is no additional “emulation” of the logical trie to flatten out the data structure
to feed into the underlying database that is unaware of the data being stored. The convenient
byproduct of this approach is that iteration is still fast (for serving state sync queries)
but compaction is not required to maintain the index. Firewood was first conceived to provide
a very fast storage layer for the EVM but could be used on any blockchain that
requires authenticated state.

Firewood is a robust database implemented from the ground up to directly store
trie nodes and user data. Unlike most (if not all) of the solutions in the field,
it is not built on top of a generic KV store such as LevelDB/RocksDB. Like a
B+-tree based store, firewood directly uses the tree structure as the index on
disk. Thus, there is no additional “emulation” of the logical trie to flatten
out the data structure to feed into the underlying DB that is unaware of the
data being stored. It provides generic trie storage for arbitrary keys and
values.
Firewood only attempts to store the latest state on-disk and will actively clean up
unused state when state diffs are committed. To avoid reference counting trie nodes,
Firewood does not copy-on-write (COW) the state trie and instead keeps
one latest version of the trie index on disk and applies in-place updates to it.
Firewood keeps some configurable number of previous states in memory to power
state sync (which may occur at a few roots behind the current state).
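The in-place-update design described above can be sketched in a few lines of Rust. This is a toy model with invented names (`Store`, `commit`), not Firewood's actual API: the on-disk trie index is stood in for by a map, and the bounded in-memory window of recent roots is a deque.

```rust
use std::collections::{HashMap, VecDeque};

// Toy model: one mutable "latest" index (stand-in for the on-disk trie),
// plus a bounded in-memory window of recent state roots for state sync.
struct Store {
    latest: HashMap<String, String>, // single latest version, updated in place
    recent_roots: VecDeque<u64>,     // configurable window of past roots
    window: usize,
}

impl Store {
    fn new(window: usize) -> Self {
        Store { latest: HashMap::new(), recent_roots: VecDeque::new(), window }
    }

    // Commit a batch of puts in place; no copy-on-write of the whole index.
    fn commit(&mut self, batch: Vec<(String, String)>, root: u64) {
        for (k, v) in batch {
            self.latest.insert(k, v); // in-place update
        }
        self.recent_roots.push_back(root);
        if self.recent_roots.len() > self.window {
            self.recent_roots.pop_front(); // actively drop state outside the window
        }
    }
}

fn main() {
    let mut s = Store::new(2);
    s.commit(vec![("a".into(), "1".into())], 100);
    s.commit(vec![("a".into(), "2".into())], 101);
    s.commit(vec![("b".into(), "3".into())], 102);
    // Only one latest value per key survives; only the last 2 roots are kept.
    assert_eq!(s.latest.get("a"), Some(&"2".to_string()));
    assert_eq!(s.recent_roots, vec![101, 102]);
    println!("latest a = {:?}, roots = {:?}", s.latest.get("a"), s.recent_roots);
}
```

The point of the sketch is the contrast with COW designs: there is never a second full copy of the index, only the small window of roots grows and shrinks.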

Firewood provides OS-level crash recovery via a write-ahead log (WAL). The WAL
guarantees atomicity and durability in the database, but also offers
@@ -34,13 +34,9 @@ store back in memory. While running the store, new changes will also contribute
to the configured window of changes (at batch granularity) to access any past
versions with no additional cost at all.

## License
firewood is licensed by the Ecosystem License. For more information, see the
[LICENSE file](./LICENSE.md).

## Architecture Diagram

![architecture diagram](./docs/assets/architecture.svg)<img src="./docs/assets/architecture.svg">
![architecture diagram](./docs/assets/architecture.svg)

## Terminology

@@ -71,12 +67,11 @@ firewood is licensed by the Ecosystem License. For more information, see the
* `Batch Operation` - An operation of either `Put` or `Delete`.
* `Batch` - An ordered set of `Batch Operation`s.
* `Proposal` - A proposal consists of a base `Root Hash` and a `Batch`, but is not
yet committed to the trie. In firewood's most recent API, a `Proposal` is required
yet committed to the trie. In Firewood's most recent API, a `Proposal` is required
to `Commit`.
* `Commit` - The operation of applying one or more `Proposal`s to the most recent
`Revision`.
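The glossary entries above could be modeled with types like the following. These are hypothetical, illustrative definitions only, not Firewood's actual API, and the "hash" computed by `commit` is a placeholder rather than a real trie re-hash:

```rust
// Hypothetical types mirroring the glossary; not Firewood's actual API.
#[derive(Debug, Clone, PartialEq)]
enum BatchOp {
    Put { key: Vec<u8>, value: Vec<u8> },
    Delete { key: Vec<u8> },
}

// A Batch is an ordered set of Batch Operations.
type Batch = Vec<BatchOp>;

// A Proposal pairs a base root hash with a batch; it is not yet committed.
struct Proposal {
    base_root: [u8; 32],
    batch: Batch,
}

impl Proposal {
    // Committing applies the proposal against the base revision and yields
    // a new root hash (faked here: real code would re-hash the trie).
    fn commit(self) -> [u8; 32] {
        let mut root = self.base_root;
        root[0] = root[0].wrapping_add(self.batch.len() as u8);
        root
    }
}

fn main() {
    let batch: Batch = vec![
        BatchOp::Put { key: b"k".to_vec(), value: b"v".to_vec() },
        BatchOp::Delete { key: b"old".to_vec() },
    ];
    let proposal = Proposal { base_root: [0u8; 32], batch };
    let new_root = proposal.commit();
    assert_eq!(new_root[0], 2);
    println!("new root byte 0 = {}", new_root[0]);
}
```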


## Roadmap

**LEGEND**
@@ -124,7 +119,7 @@ corresponding range proofs that verify the correctness of the data.
- [ ] Enforce limits on the size of the range proof as well as keys to make
synchronization easier for clients.
- [ ] MerkleDB root hash in parity for seamless transition between MerkleDB
and firewood.
and Firewood.
- [ ] Add metric reporting
- [ ] Migrate to a fully async interface, consider tokio\_uring, monoio, etc
- [ ] Refactor `Shale` to be more idiomatic, consider rearchitecting it
@@ -133,7 +128,7 @@ and firewood.
Firewood is currently Linux-only, as it depends on the asynchronous
I/O provided by the Linux kernel (see `libaio`). Unfortunately, Docker is not
able to successfully emulate the syscalls `libaio` relies on, so Linux or a
Linux VM must be used to run firewood. We intend to migrate to io\_uring which
Linux VM must be used to run Firewood. We intend to migrate to io\_uring which
should allow for this emulation.

## Run
@@ -142,12 +137,16 @@ use-cases. Try running them via the command-line, via `cargo run --release
--example simple`.

## Release
See the [release documentation](./RELEASE.md) for detailed information on how to release firewood.
See the [release documentation](./RELEASE.md) for detailed information on how to release Firewood.

## CLI
Firewood comes with a CLI tool called `fwdctl` that enables one to create and interact with a local instance of a firewood database. For more information, see the [fwdctl README](fwdctl/README.md).
Firewood comes with a CLI tool called `fwdctl` that enables one to create and interact with a local instance of a Firewood database. For more information, see the [fwdctl README](fwdctl/README.md).

## Test
```
cargo test --release
```

## License
Firewood is licensed by the Ecosystem License. For more information, see the
[LICENSE file](./LICENSE.md).
2 changes: 1 addition & 1 deletion docs/assets/architecture.svg
74 changes: 36 additions & 38 deletions firewood/src/lib.rs
@@ -1,28 +1,33 @@
// Copyright (C) 2023, Ava Labs, Inc. All rights reserved.
// See the file LICENSE.md for licensing terms.

//! # Firewood: non-archival blockchain key-value store with hyper-fast recent state retrieval.
//!
//! Firewood is an embedded key-value store, optimized to store blockchain state. It prioritizes
//! access to latest state, by providing extremely fast reads, but also provides a limited view
//! into past state. It does not copy-on-write the state trie to generate an ever
//! growing forest of tries like other databases, but instead keeps one latest version of the trie index on disk
//! and apply in-place updates to it. This ensures that the database size is small and stable
//! during the course of running Firewood. Firewood was first conceived to provide a very fast
//! storage layer for the EVM but could be used on any blockchain that requires authenticated state.
//!
//! Firewood is a robust database implemented from the ground up to directly store trie nodes and
//! user data. Unlike most (if not all) of the solutions in the field, it is not built on top of a
//! generic KV store such as LevelDB/RocksDB. Like a B+-tree based store, Firewood directly uses
//! the tree structure as the index on disk. Thus, there is no additional "emulation" of the
//! logical trie to flatten out the data structure to feed into the underlying DB that is unaware
//! of the data being stored. It provides generic trie storage for arbitrary keys and values.
//!
//! Firewood provides OS-level crash recovery via a write-ahead log (WAL). The WAL guarantees
//! atomicity and durability in the database, but also offers "reversibility": some portion
//! of the old WAL can be optionally kept around to allow a fast in-memory rollback to recover
//! some past versions of the entire store back in memory. While running the store, new changes
//! will also contribute to the configured window of changes (at batch granularity) to access any past
//! # Firewood: Compaction-Less Database Optimized for Efficiently Storing Recent Merkleized Blockchain State
//!
//! Firewood is an embedded key-value store, optimized to store recent Merkleized blockchain
//! state with minimal overhead. Firewood is implemented from the ground up to directly
//! store trie nodes on-disk. Unlike most state management approaches in the field,
//! it is not built on top of a generic KV store such as LevelDB/RocksDB. Firewood, like a
//! B+-tree based database, directly uses the trie structure as the index on-disk. Thus,
//! there is no additional “emulation” of the logical trie to flatten out the data structure
//! to feed into the underlying database that is unaware of the data being stored. The convenient
//! byproduct of this approach is that iteration is still fast (for serving state sync queries)
//! but compaction is not required to maintain the index. Firewood was first conceived to provide
//! a very fast storage layer for the EVM but could be used on any blockchain that
//! requires authenticated state.
//!
//! Firewood only attempts to store the latest state on-disk and will actively clean up
//! unused state when state diffs are committed. To avoid reference counting trie nodes,
//! Firewood does not copy-on-write (COW) the state trie and instead keeps
//! one latest version of the trie index on disk and applies in-place updates to it.
//! Firewood keeps some configurable number of previous states in memory to power
//! state sync (which may occur at a few roots behind the current state).
//!
//! Firewood provides OS-level crash recovery via a write-ahead log (WAL). The WAL
//! guarantees atomicity and durability in the database, but also offers
//! “reversibility”: some portion of the old WAL can be optionally kept around to
//! allow a fast in-memory rollback to recover some past versions of the entire
//! store back in memory. While running the store, new changes will also contribute
//! to the configured window of changes (at batch granularity) to access any past
//! versions with no additional cost at all.
//!
//! # Design Philosophy & Overview
@@ -38,7 +43,7 @@
//! well-executed plan for this is to make sure the performance degradation is reasonable or
//! well-contained with respect to the ever-increasing size of the index. This design is useful
//! for nodes which serve as the backend for some indexing service (e.g., chain explorer) or as a
//! query portal to some user agent (e.g., wallet apps). Blockchains with poor finality may also
//! query portal to some user agent (e.g., wallet apps). Blockchains with delayed finality may also
//! need this because the "canonical" branch of the chain could switch (but not necessarily a
//! practical concern nowadays) to a different fork at times.
//!
@@ -64,11 +69,10 @@
//! Firewood is built by three layers of abstractions that totally decouple the
//! layout/representation of the data on disk from the actual logical data structure it retains:
//!
//! - Linear, memory-like space: the [shale](https://crates.io/crates/shale) crate from an academic
//! project (CedrusDB) code offers a `CachedStore` abstraction for a (64-bit) byte-addressable space
//! that abstracts away the intricate method that actually persists the in-memory data on the
//! secondary storage medium (e.g., hard drive). The implementor of `CachedStore` will provide the
//! functions to give the user of `CachedStore` an illusion that the user is operating upon a
//! - Linear, memory-like space: the `shale` crate offers a `CachedStore` abstraction for a
//! (64-bit) byte-addressable space that abstracts away the intricate method that actually persists
//! the in-memory data on the secondary storage medium (e.g., hard drive). The implementor of `CachedStore`
//! provides the functions to give the user of `CachedStore` an illusion that the user is operating upon a
//! byte-addressable memory space. It is just a "magical" array of bytes one can view and change
//! that is mirrored to the disk. In reality, the linear space will be chunked into files under a
//! directory, but the user does not have to even know about this.
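A minimal sketch of such a byte-addressable abstraction follows. The names are illustrative only (shale's real `CachedStore` trait differs), and the in-memory implementor stands in for the real chunked-file backing:

```rust
// Sketch of a CachedStore-like abstraction: a 64-bit byte-addressable space
// whose backing (memory, chunked files, ...) is hidden from callers.
// Illustrative names only; shale's actual trait differs.
trait LinearStore {
    fn read(&self, offset: u64, len: usize) -> Vec<u8>;
    fn write(&mut self, offset: u64, data: &[u8]);
}

// In-memory implementor: the "magical array of bytes", no file chunking.
struct MemStore {
    space: Vec<u8>,
}

impl LinearStore for MemStore {
    fn read(&self, offset: u64, len: usize) -> Vec<u8> {
        let start = offset as usize;
        self.space[start..start + len].to_vec()
    }
    fn write(&mut self, offset: u64, data: &[u8]) {
        let start = offset as usize;
        let end = start + data.len();
        if self.space.len() < end {
            self.space.resize(end, 0); // grow the linear space on demand
        }
        self.space[start..end].copy_from_slice(data);
    }
}

fn main() {
    let mut store = MemStore { space: vec![] };
    store.write(16, b"node"); // caller sees only offsets and bytes
    assert_eq!(store.read(16, 4), b"node");
    println!("read back: {:?}", String::from_utf8(store.read(16, 4)).unwrap());
}
```

A file-backed implementor would expose the same two calls while translating offsets to (file, offset) pairs internally, which is exactly the decoupling the paragraph describes.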
@@ -84,12 +88,6 @@
//! persisted on disk. It is as if they're just in memory, which makes it much easier to write
//! and maintain the code.
//!
//! The three layers are depicted as follows:
//!
//! <p align="center">
//! <img src="https://ava-labs.github.io/firewood/assets/three-layers.svg" width="80%">
//! </p>
//!
//! Given the abstraction, one can easily realize the fact that the actual data that affect the
//! state of the data structure (trie) is what the linear space (`CachedStore`) keeps track of, that is,
//! a flat but conceptually large byte vector. In other words, given a valid byte vector as the
@@ -114,10 +112,10 @@
//! dirty pages induced by this write batch are taken out from the linear space. Although they are
//! mathematically equivalent, interval writes are more compact than pages (which are 4K in size
//! and become dirty even if a single byte is touched). So interval writes are fed into the WAL
//! subsystem (supported by [growthring](https://crates.io/crates/growth-ring)). After the
//! WAL record is written (one record per write batch), the dirty pages are then pushed to the
//! on-disk linear space to mirror the change by some asynchronous, out-of-order file writes. See
//! the `BufferCmd::WriteBatch` part of `DiskBuffer::process` for the detailed logic.
//! subsystem (supported by growthring). After the WAL record is written (one record per write batch),
//! the dirty pages are then pushed to the on-disk linear space to mirror the change by some
//! asynchronous, out-of-order file writes. See the `BufferCmd::WriteBatch` part of `DiskBuffer::process`
//! for the detailed logic.
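The page-versus-interval distinction above can be illustrated with a small sketch (a hypothetical helper, not Firewood code): an interval write records exactly the touched byte range, while dirtiness is tracked at whole-4K-page granularity.

```rust
// Sketch of the write path: a batch produces compact interval writes for the
// WAL record, while the page cache marks whole 4K pages dirty.
const PAGE_SIZE: u64 = 4096;

// An interval write is (offset, length in bytes): what the WAL records.
// This helper computes which 4K pages those intervals dirty.
fn dirty_pages(intervals: &[(u64, usize)]) -> Vec<u64> {
    let mut pages: Vec<u64> = intervals
        .iter()
        .flat_map(|&(off, len)| {
            let first = off / PAGE_SIZE;
            let last = (off + len as u64 - 1) / PAGE_SIZE;
            first..=last // every page the interval touches
        })
        .collect();
    pages.sort();
    pages.dedup();
    pages
}

fn main() {
    // One batch: a single byte at offset 8192, plus 10 bytes straddling a
    // page boundary at offset 4090.
    let intervals = [(8192u64, 1usize), (4090, 10)];
    let pages = dirty_pages(&intervals);
    // 11 bytes of interval data dirty three full 4K pages: 0, 1 and 2.
    assert_eq!(pages, vec![0, 1, 2]);
    println!("dirty pages: {:?}", pages);
}
```

Here 11 bytes of WAL payload correspond to 12 KiB of dirty pages, which is why the compact intervals go into the WAL while the pages are flushed asynchronously afterwards.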
//!
//! In short, a Read-Modify-Write (RMW) style normal operation flow is as follows in Firewood:
//!
