Write docs #140

Open · wants to merge 14 commits into `main`
115 changes: 3 additions & 112 deletions README.md
@@ -8,14 +8,16 @@
<hr/>
</div>

## Intro
## Introduction

_Scrolls_ is a tool for building and maintaining read-optimized collections of Cardano's on-chain entities. It crawls the history of the chain and aggregates all data to reflect the current state of affairs. Once the whole history has been processed, _Scrolls_ watches the tip of the chain to keep the collections up-to-date.

Examples of collections are: "utxo by address", "chain parameters by epoch", "pool metadata by pool id", "tx cbor by hash", etc.

> In other words, _Scrolls_ is just a map-reduce algorithm that aggregates the history of the chain into use-case-specific, key-value dictionaries.
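
As a rough illustration of that idea, the following sketch (hypothetical code, not Scrolls' actual API or data model) folds transactions into a "UTXO by Address" collection:

```python
# Hypothetical sketch only -- not Scrolls' actual API or data model.
from collections import defaultdict

state = defaultdict(set)  # "c1.<address>" -> set of "tx_hash:output_index"
utxo_owner = {}           # "tx_hash:output_index" -> address (the role of "enrich")

def apply_tx(tx):
    # spent inputs remove entries from the collection...
    for ref in tx["inputs"]:
        addr = utxo_owner.pop(ref, None)
        if addr is not None:
            state[f"c1.{addr}"].discard(ref)
    # ...while new outputs add entries to it
    for index, output in enumerate(tx["outputs"]):
        ref = f"{tx['hash']}:{index}"
        utxo_owner[ref] = output["address"]
        state[f"c1.{output['address']}"].add(ref)
```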

Check our [documentation](https://txpipe.github.io/scrolls) for detailed information on how to start working with Scrolls.

:warning: This tool is under heavy development. The library API, configuration schema, and storage structure may change drastically. Several important features are still missing. Use at your own peril.

## Storage
@@ -101,99 +103,6 @@ Scrolls is a pipeline that takes block data as input and outputs DB update commands
- [ ] By Mint Policy / Asset
- [ ] By Pool

## Testdrive

In the `testdrive` folder you'll find a minimal example that uses docker-compose to spin up a local Redis instance and a Scrolls daemon. You'll need Docker and docker-compose installed on your local machine. Run the following commands to start it:

```sh
cd testdrive
docker-compose up
```

You should see logs from both _Redis_ and _Scrolls_ as the chain is crawled from a remote relay node. If you're familiar with the Redis CLI, you can run the following commands to see the data being cached:

```sh
redis:6379> KEYS *
1) "c1.addr1qx0w02a2ez32tzh2wveu80nyml9hd50yp0udly07u5crl6x57nfgdzya4axrl8mfx450sxpyzskkl95sx5l7hcfw59psvu6ysx"
2) "c1.addr1qx68j552ywp6engr2s9xt7aawgpmr526krzt4mmzc8qe7p8qwjaawywglaawe74mwu726w49e8e0l9mexcwjk4kqm2tq5lmpd8"
3) "c1.addr1q90z7ujdyyct0jhcncrpv5ypzwytd3p7t0wv93anthmzvadjcq6ss65vaupzmy59dxj43lchlra0l482rh0qrw474snsgnq3df"
4) "c1.addr1w8vg4e5xdpad2jt0z775rt0alwmku3my2dmw8ezd884zvtssrq6fg"
5) "c1.addr1q9tj3tdhaxqyph568h7efh6h0f078m2pxyd0xgzq47htwe3vep55nfane06hggrc2gvnpdj4gcf26kzhkd3fs874hzhszja3lh"
6) "c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz"
redis:6379> SMEMBERS c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz
1) "2548228522837ea580bc55a3e6a09479deca499b5e7f3c08602a1f3191a178e7:20"
2) "04086c503512833c7a0c11fc85f7d0f0422db9d14b31275b3d4327c40c6fd73b:25"
redis:6379>
```

Once you're done with the testdrive, you can clean up your environment by running:

```sh
docker-compose down
```

## Installing

We currently provide the following ways to install _Scrolls_:

- Using one of the pre-compiled binaries shared via [GitHub Releases](https://github.com/txpipe/scrolls/releases)
- Using the Docker image shared via [GitHub Packages](https://github.com/txpipe/scrolls/pkgs/container/scrolls)
- By compiling from source code using the instructions provided in this README.


## Configuration

This is an example configuration file:

```toml
# get data from a relay node
[source]
type = "N2N"
address = "relays-new.cardano-mainnet.iohk.io:3001"

# You can optionally enable enrichment (a local db with transaction data); this is needed by some reducers
[enrich]
type = "Sled"
db_path = "/opt/scrolls/sled_db"

# enable the "UTXO by Address" collection
[[reducers]]
type = "UtxoByAddress"
# you can optionally prefix the keys in the collection
key_prefix = "c1"
# you can optionally only process UTXO from a set of predetermined addresses
filter = ["addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3"]

# enable the "Point by Tx" collection
[[reducers]]
type = "PointByTx"
key_prefix = "c2"

# store the collections in a local Redis
[storage]
type = "Redis"
connection_params = "redis://127.0.0.1:6379"

# start reading from an arbitrary point in the chain
[intersect]
type = "Point"
value = [57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]

# let Scrolls know that we're working with mainnet
[chain]
type = "Mainnet"
```

## Compiling from Source

To compile from source, you'll need the Rust toolchain installed on your development machine. Execute the following commands to clone and build the project:

```sh
git clone https://github.com/txpipe/scrolls.git
cd scrolls
cargo build
```

## FAQ

### Don't we have tools for this already?
@@ -211,24 +120,6 @@ Yes, we do. We have excellent tools such as: [Kupo](https://github.com/CardanoSo
There's some overlap between _Oura_ and _Scrolls_. Both tools read on-chain data and output some data results. The main difference is that Oura is meant to **_react_** to events, to watch the chain and act upon certain patterns. In contrast, _Scrolls_ is meant to provide a snapshot of the current state of the chain by aggregating the whole history.

They were built to work well together. For example, let's say you're building an app that uses Oura to process transaction data; you could then integrate _Scrolls_ as a way to look up the source address of the transaction's input.

### How do I read the data using Python?

Assuming you're using Redis as a storage backend (the only one available at the moment), we recommend using the [redis-py](https://github.com/redis/redis-py) package to talk directly to the Redis instance. This is a very simple code snippet to query the UTXOs by address:

```python
>>> import redis
>>> r = redis.Redis(host='localhost', port=6379, db=0)
>>> r.smembers("c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz")
{b'2548228522837ea580bc55a3e6a09479deca499b5e7f3c08602a1f3191a178e7:20', b'04086c503512833c7a0c11fc85f7d0f0422db9d14b31275b3d4327c40c6fd73b:25'}
```

The Redis operation being used is `smembers`, which returns the members of the set stored under a particular key. In this case, we query by the value `c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz`, where `c1` is the key prefix specified in the config for our particular collection and `addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz` is the address we're interested in querying. The response from Redis is the list of UTXOs (in the format `{tx-hash}:{output-index}`) associated with that particular address.
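
If you need the individual components, each member can be split back into its transaction hash and output index. Continuing the snippet above:

```python
# Decode each member (bytes) and split it into its tx hash and output index
members = r.smembers("c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz")
for member in members:
    tx_hash, output_index = member.decode().rsplit(":", 1)
    print(tx_hash, int(output_index))
```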

### How do I read the data using NodeJS?

TODO

### What is "swarm mode"?

Swarm mode is a way to speed up rebuilding collections from scratch: the history of the chain is partitioned into smaller fragments, and each fragment is processed by a concurrent instance of the _Scrolls_ daemon.
23 changes: 23 additions & 0 deletions book/src/SUMMARY.md
@@ -1,3 +1,26 @@
# Summary

- [Introduction](./introduction.md)
- [Installation](./installation/README.md)
- [Binary Release](./installation/binary_release.md)
- [From Source](./installation/from_source.md)
- [Docker](./installation/docker.md)
- [Usage](./usage/README.md)
- [Configuration](./configuration/README.md)
- [Sources](./configuration/sources.md)
- [Reducers](./configuration/reducers/README.md)
- [Predicates](./configuration/reducers/predicates.md)
- [Storage](./configuration/storage.md)
- [Enrich](./configuration/enrich.md)
- [Intersect](./configuration/intersect.md)
- [Chain](./configuration/chain.md)
- [Policy](./configuration/policy.md)
- [Advanced Features](./advanced/README.md)
- [Swarm Mode](./advanced/swarm_mode.md)
- [Troubleshooting](./troubleshooting/README.md)
- [Guides](./guides/README.md)
- [Testdrive](./guides/testdrive.md)
- [NodeJS Client](./guides/nodejs.md)
- [Python Client](./guides/python.md)
- [Redis-cli](./guides/redis.md)

1 change: 1 addition & 0 deletions book/src/advanced/README.md
@@ -0,0 +1 @@
# Advanced features
4 changes: 4 additions & 0 deletions book/src/advanced/swarm_mode.md
@@ -0,0 +1,4 @@
# Swarm mode

Swarm mode is a way to speed up rebuilding collections from scratch: the history of the chain is partitioned into smaller fragments, and each fragment is processed by a concurrent instance of the Scrolls daemon.
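
The following sketch shows the partitioning idea in rough terms (the slot arithmetic is illustrative only, not the actual swarm-mode mechanism):

```python
# Illustrative only: split the slot range [0, tip_slot] into N fragments,
# one per concurrent Scrolls instance.
def partition_history(tip_slot: int, instances: int) -> list[tuple[int, int]]:
    step = tip_slot // instances
    return [(i * step, tip_slot if i == instances - 1 else (i + 1) * step)
            for i in range(instances)]

# e.g. four instances covering slots 0..57_867_490
for start, end in partition_history(57_867_490, 4):
    print(f"instance processes slots {start}..{end}")
```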

48 changes: 48 additions & 0 deletions book/src/configuration/README.md
@@ -0,0 +1,48 @@
# Configuration
For the purpose of testing out Scrolls, you can use the provided configuration located in `testdrive/simple/daemon.toml`. See below for another example with explanations, and check the following sections of this book for a detailed description of each part of the configuration file.

## Format
The Scrolls daemon supports `.toml` and `.json` configuration files. Unlike JSON, TOML supports comments, which are very handy for turning declarations on and off, especially during the early stages of development, debugging, or learning. On the other hand, deeply nested filters can be difficult to express in TOML syntax, so you can either declare the whole configuration in JSON, or rely on tools like [toml2json](https://github.com/woodruffw/toml2json) and [remarshal](https://github.com/remarshal-project/remarshal) to translate small chunks of JSON (such as complex, deeply nested filters) for use in TOML configuration files.

When working with TOML configuration files, it sometimes also helps to translate the whole configuration to JSON and use [jq](https://stedolan.github.io/jq/)/[bat](https://github.com/sharkdp/bat) to make the JSON human-friendly. This often helps in understanding the structure of the filters. Example: `toml2json ./configuration.toml | jq | bat -l json`

## Configuration Example
```toml
# get data from a relay node
[source]
type = "N2N"
address = "relays-new.cardano-mainnet.iohk.io:3001"

# You can optionally enable enrichment (a local db with transaction data); this is needed by some reducers
[enrich]
type = "Sled"
db_path = "/opt/scrolls/sled_db"

# enable the "UTXO by Address" collection
[[reducers]]
type = "UtxoByAddress"
# you can optionally prefix the keys in the collection
key_prefix = "c1"
# you can optionally only process UTXO from a set of predetermined addresses
filter = ["addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3"]

# enable the "Point by Tx" collection
[[reducers]]
type = "PointByTx"
key_prefix = "c2"

# store the collections in a local Redis
[storage]
type = "Redis"
connection_params = "redis://127.0.0.1:6379"

# start reading from an arbitrary point in the chain
[intersect]
type = "Point"
value = [57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]

# let Scrolls know that we're working with mainnet
[chain]
type = "Mainnet"
```

50 changes: 50 additions & 0 deletions book/src/configuration/chain.md
@@ -0,0 +1,50 @@
# Chain

Specify which network to fetch data from.

## Fields
- type: `"Mainnet" | "Testnet" | "PreProd" | "Preview" | "Custom"`
- magic (*): `u64`
- byron_epoch_length (*): `u32`
- byron_slot_length (*): `u32`
- byron_known_slot (*): `u64`
- byron_known_hash (*): `String`
- byron_known_time (*): `u64`
- shelley_epoch_length (*): `u32`
- shelley_slot_length (*): `u32`
- shelley_known_slot (*): `u64`
- shelley_known_hash (*): `String`
- shelley_known_time (*): `u64`
- address_network_id (*): `u8`
- adahandle_policy (*): `String`


(*) Use only with `type = "Custom"`

## Examples

Using mainnet:
``` toml
[chain]
type = "Mainnet"
```

Using custom values (mainnet):
``` toml
[chain]
type = "Custom"
magic = 764824073
byron_epoch_length = 432000
byron_slot_length = 20
byron_known_slot = 0
byron_known_time = 1506203091
byron_known_hash = "f0f7892b5c333cffc4b3c4344de48af4cc63f55e44936196f365a9ef2244134f"
shelley_epoch_length = 432000
shelley_slot_length = 1
shelley_known_slot = 4492800
shelley_known_hash = "aa83acbf5904c0edfe4d79b3689d3d00fcfc553cf360fd2229b98d464c28e9de"
shelley_known_time = 1596059091
address_network_id = 1
adahandle_policy = "f0ff48bbb7bbe9d59a40f1ce90e9e9d0ff5002ec48f232b49ca0fb9a"
```
16 changes: 16 additions & 0 deletions book/src/configuration/enrich.md
@@ -0,0 +1,16 @@
# Enrich
Store UTXO information in a local DB; this is needed for some reducers to work. Currently, only [Sled](https://github.com/spacejam/sled) databases are supported.

## Fields
- type: `"Sled" | "Skip"`
- db_path (*): `String`

(*) Use only with `type = "Sled"`

## Example

``` toml
[enrich]
type = "Sled"
db_path = "/opt/scrolls/sled_db"
```
34 changes: 34 additions & 0 deletions book/src/configuration/intersect.md
@@ -0,0 +1,34 @@
# Intersect

Scrolls provides four different strategies for finding the intersection point within the chain-sync process.

- `Origin`: Scrolls will start reading from the beginning of the chain.
- `Tip`: Scrolls will start reading from the current tip of the chain.
- `Point`: Scrolls will start reading from a particular point (slot, hash) in the chain. If the point is not found, the process will be terminated with a non-zero exit code.
- `Fallbacks`: Scrolls will start reading from the first valid point within a set of alternative positions. If a point is not valid, the process will fall back to the next available point in the list of options (as sketched below). If none of the points are valid, the process will be terminated with a non-zero exit code.
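
The `Fallbacks` behavior can be pictured with a small sketch (illustrative logic only; `node_has_point` is a hypothetical check, not part of Scrolls):

```python
import sys

def resolve_intersect(points, node_has_point):
    # points: list of (slot, block_hash) pairs, in order of preference
    for slot, block_hash in points:
        if node_has_point(slot, block_hash):
            return (slot, block_hash)
    sys.exit(1)  # no valid point found: terminate with a non-zero exit code
```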


## Fields
- type: `"Tip" | "Origin" | "Point" | "Fallbacks"`
- value (*): `(u64, String) | Vec<(u64, String)>`

(*) Use a value of type `(u64, String)` with `type = "Point"`, and a value of type `Vec<(u64, String)>` with `type = "Fallbacks"`.

## Examples

Using **Point**:
``` toml
[intersect]
type = "Point"
value = [57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]
```

Using **Fallbacks**:
``` toml
[intersect]
type = "Fallbacks"
value = [
[12345678, "this_is_not_a_valid_hash_ff1b93cdfd997d4ea93e7d930908aa5905d788f"],
[57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]
]
```
17 changes: 17 additions & 0 deletions book/src/configuration/policy.md
@@ -0,0 +1,17 @@
# Policy

Specify how Scrolls should handle different categories of errors encountered while processing data.

## Fields
- missing_data: `"Skip" | "Warn" | "Default"`
- cbor_errors: `"Skip" | "Warn" | "Default"`
- ledger_errors: `"Skip" | "Warn" | "Default"`
- any_error: `"Skip" | "Warn" | "Default"`


## Example

``` toml
[policy]
cbor_errors = "Skip"
missing_data = "Warn"
```
