The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. See MAINTAINERS.md for instructions to keep up to date.
- Adding nil safety check on the `CombinedFilter` and looping over the `transaction_trace` receipts
- Bump `substreams` and `dmetering` to latest version, adding the `outputModuleHash` to the metering sender.
> [!NOTE]
> All caches for stores using the `set_sum` update policy (added in substreams v1.7.0) and modules that depend on them will need to be deleted, since they may contain bad data.
- Fix bad data in stores using the `set_sum` policy: squashing of store segments incorrectly "summed" some values that should have been "set" if the last event for a key on this segment was a "sum"
- Fix small bug making some requests in development-mode slow to start (when starting close to the module initialBlock with a store that doesn't start on a boundary)
- Fixed an(other) issue where multiple stores running on the same stage with different initialBlocks would fail to progress (and hang)
- Fix bug where some invalid cursors could be sent (with 'LIB' being above the block being sent) and add a safeguard/logging if the bug appears again
- Fix panic in the whole tier2 process when stores go above the size limit while being read from "kvops" cached changes
- Fix "cannot resolve 'old cursor' from files in passthrough mode" error on some requests with an old cursor
- Fix handling of 'special case' substreams module with only "params" as its input: should not skip this execution (used in graph-node for head tracking)
  → empty files in the module cache with hash `d3b1920483180cbcd2fd10abcabbee431146f4c8` should be deleted for consistency
- [Operator] The flag `--advertise-block-id-encoding` now accepts shorter forms: `hex`, `base64`, etc. The older long form `BLOCK_ID_ENCODING_HEX` is still supported, but we suggest using the shorter form from now on.
> [!NOTE]
> Since a bug that affected substreams with "skipping blocks" was corrected in this release, any previously produced substreams cache should be considered possibly corrupted and eventually replaced.
- Substreams: fix bad handling of modules with multiple inputs when only one of them is filtered, resulting in bad outputs in production-mode.
- Substreams: fix stalling on some substreams with stores and mappers with different start block numbers on the same stage
- Substreams: fix 'development mode' and LIVE mode executing some modules that should be skipped
- Bump substreams to v1.10.0
- Bump firehose-core to v1.6.1
- Added `sf.firehose.v2.EndpointInfo/Info` service on Firehose and `sf.substreams.rpc.v2.EndpointInfo/Info` on Substreams endpoints. This involves the following new flags:
  - `advertise-chain-name`: canonical name of the chain according to https://thegraph.com/docs/en/developing/supported-networks/ (required, unless it is in the "well-known" list)
  - `advertise-chain-aliases`: alternate names for that chain (optional)
  - `advertise-block-features`: list of features describing the blocks (optional)
  - `ignore-advertise-validation`: runtime checks of chain name/features/encoding against the genesis block will no longer cause the server to wait or fail
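  For illustration, here is a hedged sketch of how these flags might be set on a Firehose endpoint (all values are placeholders, not canonical):

  ```bash
  # Illustrative only: values must match your actual chain.
  fireeth start firehose \
    --advertise-chain-name=mainnet \
    --advertise-chain-aliases=ethereum \
    --advertise-block-features=base \
    --advertise-block-id-encoding=hex
  ```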
- Added a well-known list of chains (hard-coded in `wellknown/chains.go`) to help automatically determine the 'advertise' flag values. Users are encouraged to propose Pull Requests to add more chains to the list.
- The new info endpoint adds a mandatory fetch of the first streamable block on startup, failing if no block can be fetched after 3 minutes while you are running the `firehose` or `substreams-tier1` service. It validates the following on a well-known chain:
  - If the first-streamable-block Num/ID matches the genesis block of a known chain, e.g. `matic`, it will refuse any value for `advertise-chain-name` other than `matic` or one of its aliases (`polygon`)
  - If the first-streamable-block does not match any known chain, it will require `advertise-chain-name` to be non-empty
- Substreams: added `--common-tmp-dir` flag to activate local caching of pre-compiled WASM modules through a wazero v1.8.0 feature (performance improvement on WASM compilation)
- Substreams: reverted module hash calculation from `v2.6.5` when using a non-zero firstStreamableBlock. Hashes will now be the same even if the chain's first streamable block affects the initialBlock of a module.
- Substreams: added `--substreams-block-execution-timeout` flag (default 3 minutes) to prevent requests from stalling. Timeout errors are returned to the client, who can decide to retry.
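As a hedged illustration, the two new flags might be combined like this (paths and timeout value are examples only):

```bash
# Illustrative only: enable the local WASM pre-compile cache and a custom execution timeout.
fireeth start substreams-tier2 \
  --common-tmp-dir=/var/cache/fireeth-tmp \
  --substreams-block-execution-timeout=3m
```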
- Bump substreams to v1.9.3: fix high CPU usage on tier1 caused by a bad error handling
- Bump substreams to v1.9.2: Prevent Noop handler from sending outputs with 'Stalled' step in cursor (which breaks substreams-sink-kv)
- Bump firehose-core to v1.5.6: add `--reader-node-line-buffer-size` flag and bump the default value from 100M to 200M to get past the crazy block 278208000 on Solana
- Fixed a bug in the blockfetcher which could cause transaction receipts to be nil
- Fixed a bug in substreams where chains with non-zero first-streamable-block would cause some substreams to hang. Solution changes the 'cached' hashes for those substreams.
- Fix a bug introduced in v1.6.0 that could result in a corrupted store "state" file if all the "outputs" were already cached for a module in a given segment (rare occurrence)
- We recommend clearing your substreams cache after this upgrade and re-processing or validating your data if you use stores.
- Expose a new intrinsic to modules: `skip_empty_output`, which causes the module output to be skipped if it has zero bytes. (Watch out: a protobuf object with all its default values serializes to zero bytes.)
- Improve scheduling order (faster time to first block) for substreams with multiple stages when starting mid-chain
- fix "hub" not recovering on certain disconnections in relayer/firehose/substreams (scenarios requiring full restart)
- Bumped firehose-core to v1.5.2 and substreams v1.8.0
- Added substreams back-filler to populate cache for live requests when the blocks become final
- Fixed: truncate very long details on error messages to prevent them from disappearing when behind a (misbehaving) load-balancer
- Bumped firehose-core to v1.5.1 and substreams to v1.7.3
- Bootstrapping from live blocks improved for chains with very slow or very fast blocks (affects relayer, firehose and substreams tier1)
- Substreams: fixed slow response close to HEAD in production-mode
- The Substreams engine is now able to run Rust code that depends on `solana_program` in Solana land and `alloy`/`ether-rs` in Ethereum land to decode.

When used in a `wasm32-unknown-unknown` context, those libraries pull in a bunch of `wasm-bindgen` imports in the resulting Substreams Rust code, imports that previously led to runtime errors because the Substreams engine didn't know about those special imports until today.

The Substreams engine is now able to "shim" those `wasm-bindgen` imports, enabling you to run code that depends on libraries like `solana_program` and `alloy`/`ether-rs` which are known to pull them in. This works as long as you do not actually call those special imports; normal usage of those libraries doesn't accidentally call them. If they are called, the WASM module will fail at runtime and stall the Substreams module from going forward.

To enable this feature, you need to explicitly opt in by appending `+wasm-bindgen-shims` at the end of the binary's type in your Substreams manifest:
```yaml
binaries:
  default:
    type: wasm/rust-v1
    file: <some_file>
```
to become
```yaml
binaries:
  default:
    type: wasm/rust-v1+wasm-bindgen-shims
    file: <some_file>
```
- Substreams clients now enable gzip compression over the network (already supported by servers).
- Substreams binary type can now optionally be composed of runtime extensions by appending `+<extension>,[<extensions...>]` at the end of the binary type. Extensions are `key[=value]` pairs that are runtime specific.

  > [!NOTE]
  > If you are a library author parsing generic Substreams manifest(s), you will now need to handle that possibility in the binary type. If you were reading the field without any processing, you don't have to change anything.
- bump firehose-core to v1.4.2
- execout: preload only one file instead of two, log if undeleted caches found
- execout: add environment variable `SUBSTREAMS_DISABLE_PRELOAD_EXEC_FILES` to disable file preloading
- Revert sanity check to support the special case of a substreams with only 'params' as input. This allows a chain-agnostic event to be sent, along with the clock.
- Fix error handling when resolved start-block == stop-block and stop-block is defined as non-zero
> [!NOTE]
> Upgrading will require changing the tier1 and tier2 versions concurrently, as the internal protocol has changed.
- Index Modules and Block Filter now supported. See https://github.com/streamingfast/substreams-foundational-modules for an example implementation
- Various scheduling and performance improvements
- env variable `SUBSTREAMS_WORKERS_RAMPUP_TIME` changed from `4s` to `0`. Set it to `4s` to keep the previous behavior.
- `otelcol://` tracing protocol no longer supported
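A minimal sketch, assuming you export the variable before starting the app:

```bash
# Restore the pre-change ramp-up behavior.
export SUBSTREAMS_WORKERS_RAMPUP_TIME=4s
```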
- Fixed a crash when an `eth_call` batch is of length 0 and a retry is attempted.
- Allow writing to stores with out-of-order ordinals (they will be reordered at the end of the module execution for each block)
- Fix issue in substreams-tier2 causing some files to sometimes be written to the wrong place under load, resulting in some hanging requests
- The `fireeth tools download-from-firehose` command now respects its documentation when doing `--help`; the correct invocation is now `fireeth tools download-from-firehose <endpoint> <start>:<end> <output_folder>`.
- The `fireeth tools download-from-firehose` command has been improved to work with the new Firehose `sf.firehose.v2.BlockMetadata` field; if the server sends this new field, the tool works on any chain. If the server you are reaching is not recent enough, the tool falls back to the previous logic. All StreamingFast endpoints should be compatible.
- Firehose responses (both single block and stream) now include the `sf.firehose.v2.BlockMetadata` field. This new field contains the chain-agnostic fields we hold about any block of any chain.
- bump substreams to v1.5.5 with fix in wazero to prevent process freezing on certain substreams
- Added support for Firehose reader format 2.5, which will be required for `BSC 1.4.5+`.
- Updated block model to add `BalanceChange#Reason.REWARD_BLOB_FEE` for the BSC Tycho hard-fork.
- fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
- fix a rare possibility of stalling if only some full-KV store caches were deleted, but further segments were still present.
- fix stats counters for store operations time
- fix memory leak on substreams execution (by bumping wazero dependency)
- remove the need for substreams-tier1 blocktype auto-detection
- fix missing error handling when writing output data to files. This could result in tier1 request just "hanging" waiting for the file never produced by tier2.
- fix handling of dstore error in tier1 'execout walker' causing stalling issues on S3 or on unexpected storage errors
- increase number of retries on storage when writing states or execouts (5 -> 10)
- prevent slow squashing when loading each segment from full KV store (can happen when a stage contains multiple stores)
- Fix a context leak causing tier1 responses to slow down progressively
- fix thread leak in metering affecting substreams
- revert a substreams scheduler optimisation that causes slow restarts when close to head
- add `substreams_tier2_active_requests` and `substreams_tier2_request_counter` prometheus metrics
- Substreams bumped to @v1.5.0: See https://github.com/streamingfast/substreams/releases/tag/v1.5.0 for details.
- A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
- This allows you to better use your computing resources by pooling all the networks together.
> [!IMPORTANT]
> Since the `tier2` services will now get the network information from the `tier1` request, you must make sure that the file paths and network addresses are the same for both tiers.
> Ex: if `--common-merged-blocks-store-url=/data/merged` is set on tier1, make sure the merged blocks are also available from tier2 under the path `/data/merged`.
> The flags `--substreams-state-store-url`, `--substreams-state-store-default-tag`, `--common-merged-blocks-store-url`, `--substreams-rpc-endpoints` and `--substreams-rpc-gas-limit` are now ignored on tier2.
> The flag `--common-first-streamable-block` should be set to 0 to accommodate every chain.
> Non-Ethereum chains can query a `firehose-ethereum` tier2, but the opposite is not true, since only `firehose-ethereum` implements the `eth_call` WASM extension.
> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files. The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
- All module outputs are now cached. (previously, only the last module was cached, along with the "store snapshots", to allow parallel processing).
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (ex: when multiple requests are indexing the same data at the same time)
- Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached.
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%!
> [!TIP]
> Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%!) -- the very first request does most of the work for the other ones.

> [!TIP]
> More caches will increase disk usage and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.

> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files. The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
- Readiness metric for the Substreams tier1 app is now named `substreams_tier1` (was mistakenly called `firehose` before).
- Added back readiness metric for the Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently making against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total Substreams worker requests a tier1 app made against tier2 nodes.
- Added `--merger-delete-threads` to customize the number of threads the merger will use to delete files. It's recommended to increase this to 25 or higher when using Ceph as the S3 storage provider (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
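For example, a hedged sketch for a Ceph-backed deployment (the value 25 follows the recommendation above; adjust to your storage):

```bash
fireeth start merger --merger-delete-threads=25
```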
- Fixed `tools check merged-blocks` default range when `-r <range>` is not provided to now be `[0, +∞]` (was previously `[HEAD, +∞]`).
- Fixed `tools check merged-blocks` to be able to run without a block range provided.
- Added API-key-based authentication to `tools firehose-client` and `tools firehose-single-block-client`; specify the value through the environment variable `FIREHOSE_API_KEY` (you can use the flag `--api-key-env-var` to change the variable's name to something other than `FIREHOSE_API_KEY`).
- Fixed `tools check merged-blocks` examples using a block range (the range should be specified as `[<start>]?:[<end>]`).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 Substreams service.
- Adding traceID for RPCCalls
- BlockFetcher: added support for WithdrawalsRoot, BlobGasUsed, BlobExcessGas and ParentBeaconRoot fields when fetching blocks from RPC.
- Substreams: add support for the `substreams-tier2-max-concurrent-requests` flag to limit the number of concurrent requests to tier2
> [!WARNING]
> This release deprecates the "RPC Cache (for eth_calls)" feature of substreams: it has been turned off by default and will not be supported in future releases.
> The RPC cache was a not-well-known feature that cached all `eth_call` responses by default and loaded them on each request.
> It is being deprecated because it has a negative impact on global performance.
> If you want to cache your `eth_call` responses, you should do it in a specialized proxy instead of having substreams manage this.
> Until the feature is completely removed, you can keep the previous behavior by setting the `--substreams-rpc-cache-store-url` flag to a non-empty value (its previous default value was `{data-dir}/rpc-cache`).
- Performance: prevent reprocessing jobs when there is only a mapper in production mode and everything is already cached
- Performance: prevent "UpdateStats" from running too often and stalling other operations when running with a high parallel jobs count
- Performance: fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Added the output module's hash to the "incoming request" log
- Substreams RPC: added `--substreams-rpc-gas-limit` flag to allow overriding the default of 50M. Arbitrum chains behave better with a value of `0` to avoid `intrinsic gas too low (supplied gas 50000000)` errors
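A minimal sketch for an Arbitrum chain, per the note above (the app invocation is illustrative):

```bash
# 0 disables the gas cap, avoiding 'intrinsic gas too low' errors on Arbitrum.
fireeth start substreams-tier1 --substreams-rpc-gas-limit=0
```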
- The `reader-node-bootstrap-url` gained the ability to be bootstrapped from a `bash` script.

  If the bootstrap URL is of the form `bash:///<path/to/script>?<parameters>`, the bash script at `<path/to/script>` will be executed. The script receives the resolved reader node variables as environment variables in the form `READER_NODE_<VARIABLE_NAME>`. The fully resolved node arguments (from `reader-node-arguments`) are passed as args to the bash script. The accepted query parameters are:
  - `arg=<value>` | Pass as an extra argument to the script, prepended to the list of resolved node arguments
  - `env=<key>%3d<value>` | Pass as an extra environment variable as `<key>=<value>` with the key upper-cased (multiple(s) allowed)
  - `env_<key>=<value>` | Pass as an extra environment variable as `<key>=<value>` with the key upper-cased (multiple(s) allowed)
  - `cwd=<path>` | Change the working directory to `<path>` before running the script
  - `interpreter=<path>` | Use `<path>` as the interpreter to run the script
  - `interpreter_arg=<arg>` | Pass `<interpreter_arg>` as arguments to the interpreter before the script path (multiple(s) allowed)
  > [!NOTE]
  > The `bash:///` script support is currently experimental and might change in upcoming releases; behavior changes will be clearly documented here.
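For illustration, a hypothetical bootstrap script following these conventions (the script name, query parameters and the exact `READER_NODE_*` variable names are assumptions for the example):

```bash
#!/usr/bin/env bash
# Referenced as, e.g.: reader-node-bootstrap-url: bash:///./bootstrap.sh?env=NETWORK%3dmainnet
echo "network: ${NETWORK}"               # injected via the env=... query parameter
echo "data dir: ${READER_NODE_DATA_DIR}" # hypothetical resolved reader node variable
echo "node arguments: $*"                # resolved reader-node-arguments passed as args
```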
- Fix JSON decoding in the client tools (firehose-client, print merged-blocks, etc.).
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Fix block poller panic on v2.3.2
- This release has a broken RPC poller component. Upgrade to v2.3.3.
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Add missing metering events for `sf.firehose.v2.Fetch/Block` responses.
- Changed default polling interval in 'continuous authentication' from 10s to 60s; added 'interval' query param to the URL.
- Fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Fixed load-balancing from tier1 to tier2 when using `dns:///` (round-robin policy was not set correctly)
- Added `trace_id` in gRPC authentication calls
- Bumped connect-go library to its new "connectrpc.com/connect" location
- Firehose blocks that were produced using the RPC Poller will have to be extracted again to fix the Transaction Status and the potential missing receipt (ex: arb-one pre-nitro, Avalanche, Optimism ...)
- Fix race condition in RPC Poller which would cause some missing transaction receipts
- Fix conversion of transaction status from RPC Poller: failed transactions would show up as "status unknown" in firehose blocks.
- Added support for the `FORCE_FINALITY_AFTER_BLOCKS` environment variable: setting it to a value like '200' will make the 'reader' mark blocks as final after a maximum of 200 block confirmations, even if the chain implements finality via a beacon that lags behind.
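A minimal sketch, assuming the variable is exported in the reader's environment:

```bash
# Mark blocks final after at most 200 confirmations.
export FORCE_FINALITY_AFTER_BLOCKS=200
```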
- Reduced logging and logging "payload".
- Tools printing the Firehose `Block` model to JSON now give `--proto-paths` higher precedence over well-known types and even the chain itself; the order is `--proto-paths` > `chain` > `well-known` (so `well-known` is looked up last).
- The `tools print one-block` command now works correctly on blocks generated by the omni-chain `firecore` binary.
- The various health endpoints now set the `Content-Type: application/json` header prior to sending back their response to the client.
- The `firehose`, `substreams-tier1` and `substreams-tier2` health endpoints now respect the `common-system-shutdown-signal-delay` configuration value, meaning the health endpoint will return `false` if `SIGINT` has been received but we are still in the shutdown unready period defined by the config value. If you use some sort of load balancer, you should make sure it is configured to use the health endpoint, and you should set `common-system-shutdown-signal-delay` to something like `15s`.
- Changed `reader` logger back to `reader-node` to fit with the app's name, which is `reader-node`.
- Fixed `tools compare-blocks` that would fail on the new format.
- Fixed `substreams` to correctly delete `.partial` files when serving a request that is not on a boundary.
The Cancun hard fork happened on Goerli and, after further review, we decided to change the Protobuf definition for the new `BlockHeader`, `Transaction` and `TransactionReceipt` fields that are related to blob transactions.
We made explicit that those fields are optional in the Protobuf definition, which will render them in your language of choice using the appropriate "null" mechanism. For example, on Golang those fields are generated as `BlobGasUsed *uint64` and `ExcessBlobGas *uint64`, which makes it clear when those fields are not populated at all.
The affected fields are:
- `BlockHeader.blob_gas_used`, now `optional uint64`.
- `BlockHeader.excess_blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas_fee_cap`, now `optional BigInt`.
- `TransactionReceipt.blob_gas_used`, now `optional uint64`.
- `TransactionReceipt.blob_gas_price`, now `optional BigInt`.
This is technically a breaking change for those that could have consumed those fields already, but we think the impact is so minimal that it's better to make the change right now.
You will need to reprocess a small Goerli range. You should update to the new version to produce the newer blocks and then reprocess from block 10377700 up to when you upgraded to v2.2.2.
The block 10377700 was chosen since it is the block at the time of the first release we did supporting Cancun, where we introduced those new fields. If you know when you deployed either `v2.2.0` or `v2.2.1`, you should reprocess from that point.
An alternative to reprocessing is updating your blocks by having a StreamingFast API Token and using `fireeth tools download-from-firehose goerli.eth.streamingfast.io:443 -a SUBSTREAMS_API_TOKEN 10377700:<recent block rounded to 100s> <destination>`.
> [!NOTE]
> You should download the blocks to a temporary destination and copy them over to your production destination once you have them all.
> You can reach out to us on Discord if you need help with anything.
- Updated the documentation for some of the upcoming new Cancun hard-fork fields:
- Added support for EIP-4844 (upcoming with activation of the Cancun fork), through instrumented go-ethereum nodes with version `fh2.4`. This adds new fields in the Ethereum Block model, fields that will be non-empty when the Ethereum network you're pulling has EIP-4844 activated. The fields in question are:
  - `Block.system_calls`
  - `BlockHeader.blob_gas_used`
  - `BlockHeader.excess_blob_gas`
  - `BlockHeader.parent_beacon_root`
  - `TransactionTrace.blob_gas`
  - `TransactionTrace.blob_gas_fee_cap`
  - `TransactionTrace.blob_hashes`
  - `TransactionReceipt.blob_gas_used`
  - `TransactionReceipt.blob_gas_price`
  - A new `TransactionTrace.Type` value `TRX_TYPE_BLOB`
> [!IMPORTANT]
> Operators running the Goerli chain will need to upgrade to this version, with this geth node release: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.13.10-fh2.4
- Fixed error-passing between tier2 and tier1 (tier1 will not retry sending requests that fail deterministically to tier2)
- Tier1 will now schedule a single job on tier2, quickly ramping up to the requested number of workers after 4 seconds of delay, to catch early exceptions
- "store became too big" is now considered a deterministic error and returns code "InvalidArgument"
- Added `tools poller generic-evm` subcommand. It is identical in features to optimism/arb-one at the moment and should work for most EVM chains.
- Bump to major release firehose-core v1.0.0
> [!IMPORTANT]
> When upgrading your stack to this release, be sure to upgrade all components simultaneously because the block encapsulation format has changed. Blocks that are merged using the new merger will not be readable by previous versions. There is no simple way to revert, except by deleting all the one-blocks and merged-blocks that were produced with this version.
- Block files (one-blocks and merged) are now stored in a new format using `google.protobuf.any`. Previous blocks can still be read and processed.
- Added RPC pollers for Optimism and Arb-one: these can be used by running the reader-node with `--reader-node-path=/path/to/fireeth` and `--reader-node-arguments="tools poller {optimism|arb-one} [more flags...]"`
- Added `tools fix-any-type` to rewrite the previous merged-blocks (OPTIONAL)
- Fixed grpc error code when shutting down: changed from Canceled to Unavailable
- Fixed SF_TRACING feature (regression broke the ability to specify a tracing endpoint)
- Fixed substreams GRPC/Connect error codes not propagating correctly
- Firehose connections rate-limiting will now force an (increased) delay of between 1 and 4 seconds (random value) before refusing a connection when under heavy load
- Fixed the `fix-polygon-index` tool (a parsing error made it unusable in v2.0.0-rc.1)
- Fixed some false positives in `compare-blocks-rpc`
This release refactors the `firehose-ethereum` repository to use the common shared Firehose Core library (https://github.com/streamingfast/firehose-core) that every single Firehose-supported chain should use and follow.
Both at the data level and the gRPC level, there are no changes in behavior to the core components, which are `reader-node`, `merger`, `relayer`, `firehose`, `substreams-tier1` and `substreams-tier2`.
A lot of changes happened at the operator level, however, and some superfluous modes have been removed, especially around the `reader-node` application. The full changes are listed below; operators should review the changelog thoroughly.
> [!IMPORTANT]
> It's important to emphasize that at the data level, nothing changed, so reverting to 1.4.22 in case of a problem is quite easy and no special data migration is required outside of changing back to the old set of flags that was used before.
You will find below the detailed upgrade procedure for the configuration file operators usually use. If you are using the flags based approach, simply update the corresponding flags.
> [!IMPORTANT]
> We have had reports of older versions of this software creating corrupted merged-blocks files (with duplicate or out-of-bound blocks). This release adds additional validation of merged-blocks to prevent serving duplicate blocks from the firehose or substreams service. This may cause a service outage if you have produced those blocks or downloaded them from another party who was affected by this bug. See the "Finding and fixing corrupted merged-blocks-files" section to learn how you can prevent a service outage.
Here is a bullet list for upgrading your instance; we still recommend fully reading each section below, but this list can serve as a checklist. The list below is done in such a way that you get back the same "instance" as before. The listening-address changes can be omitted as long as you update other tools, like your load balancer, to account for the port changes.
- Add config `config-file: ./sf.yaml` if not present already
- Add config `data-dir: ./sf-data` if not present already
- Rename config `verbose` to `log-verbosity` if present
- Add config `common-blocks-cache-dir: ./sf-data/blocks-cache` if not present already
- Remove config `common-chain-id` if present
- Remove config `common-deployment-id` if present
- Remove config `common-network-id` if present
- Add config `common-live-blocks-addr: :13011` if not present already
- Add config `relayer-grpc-listen-addr: :13011` if `common-live-blocks-addr` has been added in the previous step
- Add config `reader-node-grpc-listen-addr: :13010` if not present already
- Add config `relayer-source: :13010` if `reader-node-grpc-listen-addr` has been added in the previous step
- Remove config `reader-node-enforce-peers` if present
- Remove config `reader-node-log-to-zap` if present
- Remove config `reader-node-ipc-path` if present
- Remove config `reader-node-type` if present
- Replace config `reader-node-arguments: +--<flag1> --<flag2> ...` by `reader-node-arguments: --networkid=<network-id> --datadir={node-data-dir} --port=30305 --http --http.api=eth,net,web3 --http.port=8547 --http.addr=0.0.0.0 --http.vhosts=* --firehose-enabled --<flag1> --<flag2> ...`

  > [!NOTE]
  > The `<network-id>` is dynamic and should be replaced with a literal value like `1` for Ethereum Mainnet. The `{node-data-dir}` value is a templating value that is going to be resolved for you (resolves to the value of config `reader-node-data-dir`).

  > [!IMPORTANT]
  > Ensure that `--firehose-enabled` is part of the flags! Moreover, tweak the flags to avoid repetitions if you were overriding some of them.

- Remove `node` under the `start: args:` list
- Add config `merger-grpc-listen-addr: :13012` if not present already
- Add config `firehose-grpc-listen-addr: :13042` if not present already
- Add config `substreams-tier1-grpc-listen-addr: :13044` if not present already
- Add config `substreams-tier2-grpc-listen-addr: :13045` if not present already
- Add config `substreams-tier1-subrequests-endpoint: :13045` if `substreams-tier1-grpc-listen-addr` has been added in the previous step
- Replace config `combined-index-builder` with `index-builder` under the `start: args:` list
- Rename config `common-block-index-sizes` to `common-index-block-sizes` if present
- Rename config `combined-index-builder-grpc-listen-addr` to `index-builder-grpc-listen-addr` if present
- Add config `index-builder-grpc-listen-addr: :13043` if you didn't have `combined-index-builder-grpc-listen-addr` previously
- Rename config `combined-index-builder-index-size` to `index-builder-index-size` if present
- Rename config `combined-index-builder-start-block` to `index-builder-start-block` if present
- Rename config `combined-index-builder-stop-block` to `index-builder-stop-block` if present
- Replace any occurrences of `{sf-data-dir}` with `{data-dir}` in any of your configuration values if present
- The default value for `config-file` changed from `sf.yaml` to `firehose.yaml`. If you didn't have this flag defined and wish to keep the old default, define `config-file: sf.yaml`.
- The default value for `data-dir` changed from `sf-data` to `firehose-data`. If you didn't have this flag defined before, you should either move `sf-data` to `firehose-data` or define `data-dir: sf-data`.

  > [!NOTE]
  > This is an important change; forgetting to change it will change the expected locations of data, leading to errors or wrong data.

- Deprecated: The `{sf-data-dir}` templating argument used in various flags to resolve to the `--data-dir=<location>` value has been deprecated and should now be simply `{data-dir}`. The older replacement is still going to work, but you should replace any occurrences of `{sf-data-dir}` in your flag definitions by `{data-dir}`.
- The default value for `common-blocks-cache-dir` changed from `{sf-data-dir}/blocks-cache` to `file://{data-dir}/storage/blocks-cache`. If you didn't have this flag defined and you had `common-blocks-cache-enabled: true`, you should define `common-blocks-cache-dir: file://{data-dir}/blocks-cache`.
- The default value for `common-live-blocks-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `common-live-blocks-addr: :13011` and ensure you also modify `relayer-grpc-listen-addr: :13011` (see next entry for details).
- The Go module `github.com/streamingfast/firehose-ethereum/types` has been removed; if you were depending on `github.com/streamingfast/firehose-ethereum/types` in your project before, depend directly on `github.com/streamingfast/firehose-ethereum` instead.

  > [!NOTE]
  > This will pull in many more dependencies than before; if you're reluctant about such additions, talk to us on Discord and we can offer alternatives depending on what you were using.

- The config value `verbose` has been renamed to `log-verbosity`, keeping the same semantics and default value as before.

  > [!NOTE]
  > The short flag version is still `-v` and can still be provided multiple times like `-vvvv`.
This change will impact all operators currently running Firehose on Ethereum, so it's important to pay attention to the upgrade procedure below; if you are unsure of something, reach out to us on Discord.
Before this release, the `reader-node` app was managing a portion of the `reader-node-arguments` configuration value for you, prepending some arguments that would be passed to `geth` when invoking it. The list of arguments that were automatically provided before:

```
--networkid=<value of config value 'common-network-id'>
--datadir=<value of config value 'reader-node-data-dir'>
--ipcpath=<value of config value 'reader-node-ipc-path'>
--port=30305
--http
--http.api=eth,net,web3
--http.port=8547
--http.addr=0.0.0.0
--http.vhosts=*
--firehose-enabled
```

We have now removed those magical additions, and operators are now responsible for providing the flags required to properly run a Firehose-enabled native `geth` node. The `+` sign that was used to append/override the flags has also been removed; since no default additions are performed, the `+` became useless. To make some flags easier to define and avoid repetition, a few templating variables can be used within the `reader-node-arguments` value:
- `{data-dir}`: the current data-dir path defined by the config value `data-dir`
- `{node-data-dir}`: the node data dir path defined by the flag `reader-node-data-dir`
- `{hostname}`: the machine's hostname
- `{start-block-num}`: the resolved start block number defined by the flag `reader-node-start-block-num` (can be overwritten)
- `{stop-block-num}`: the stop block number defined by the flag `reader-node-stop-block-num`

As an example, if you provide the config value `reader-node-data-dir=/var/geth`, then you could use `reader-node-arguments: --datadir={node-data-dir}` and that would resolve to `reader-node-arguments: --datadir=/var/geth` for you.
> [!NOTE]
> The `reader-node-arguments` value is a string that is parsed using shell word-splitting rules, which means for example that double quotes are supported, like `--datadir="/var/with space/path"`, and the argument will be correctly accepted. We use https://github.com/kballard/go-shellquote as our parsing library.
We also removed the following `reader-node` configuration values:
- `reader-node-type` (no replacement needed, just remove it)
- `reader-node-ipc-path` (if you were using that, define it manually using the `geth` flag `--ipcpath=...`)
- `reader-node-enforce-peers` (if you were using that, use a `geth` config file to add static peers to your node; read about static peers for `geth` on the Web)
Default listening addresses also changed to be the same on all `firehose-<...>` projects, meaning consistent ports across all chains for operators. The `reader-node-grpc-listen-addr` default listen address went from `:13010` to `:10010` and `reader-node-manager-api-addr` from `:13009` to `:10011`. If you have no occurrences of `13010` or `13009` in your config file or your scripts, there is nothing to do. Otherwise, feel free to adjust the default ports to fit your needs; if you do change `reader-node-grpc-listen-addr`, ensure `--relayer-source` is also updated, as by default it points to `:10010`.
Here is an example of the required changes.

Change:

```yaml
start:
  args:
  - ...
  - reader-node
  - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-enforce-peers: localhost:13041
    reader-node-arguments: +--firehose-genesis-file=./reader/genesis.json --authrpc.port=8552
    reader-node-log-to-zap: false
    ...
```
To:

```yaml
start:
  args:
  - ...
  - reader-node
  - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-arguments:
      --networkid=1515
      --datadir={node-data-dir}
      --ipcpath={data-dir}/reader/ipc
      --port=30305
      --http
      --http.api=eth,net,web3
      --http.port=8547
      --http.addr=0.0.0.0
      --http.vhosts=*
      --firehose-enabled
      --firehose-genesis-file=./reader/genesis.json
      --authrpc.port=8552
    ...
```
> [!NOTE]
> Adjust the `--networkid=1515` value to fit your targeted chain; see https://chainlist.org/ for a list of Ethereum chains and their `network-id` values.
In previous versions of `firehose-ethereum`, it was possible to use the `node` app to launch a managed "peering/backup/whatever" Ethereum node; this is not possible anymore. If you were using the `node` app previously, like in this config:

```yaml
start:
  args:
  - ...
  - node
  - ...
  flags:
    ...
    node-...
```
You must now remove the `node` app from `args` and any flags starting with `node-`. The migration path is to run those nodes on your own without the use of `fireeth`, using whatever tools fit your desired needs.
We have completely dropped support for this to concentrate on the core mission of Firehose, which is to run reader nodes to extract Firehose blocks.
> [!NOTE]
> This is about the `node` app and not the `reader-node`; we think usage of this app is minimal/nonexistent.
The app has been renamed to simply `index-builder` and the flags have been renamed as well, removing the `combined-` prefix in front of them.
Change:

```yaml
start:
  args:
  - ...
  - combined-index-builder
  - ...
  flags:
    ...
    combined-index-builder-grpc-listen-addr: ":9999"
    combined-index-builder-index-size: 10000
    combined-index-builder-start-block: 0
    combined-index-builder-stop-block: 0
    ...
```

To:

```yaml
start:
  args:
  - ...
  - index-builder
  - ...
  flags:
    ...
    index-builder-grpc-listen-addr: ":9999"
    index-builder-index-size: 10000
    index-builder-start-block: 0
    index-builder-stop-block: 0
    ...
```
- Flag `common-block-index-sizes` has been renamed to `common-index-block-sizes`.
> [!NOTE]
> Rename only the configuration items you had previously defined; do not copy-paste the example above verbatim.
- The default value for `relayer-grpc-listen-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `relayer-grpc-listen-addr: :13011` and ensure you also modify `common-live-blocks-addr: :13011` (see previous entry for details).
- The default value for `relayer-source` changed from `:13010` to `:10010`. If you didn't have this flag defined and wish to keep the old default, define `relayer-source: :13010` and ensure you also modify `reader-node-grpc-listen-addr: :13010`.

  > [!NOTE]
  > Must align with `reader-node-grpc-listen-addr`!

- The default value for `firehose-grpc-listen-addr` changed from `:13042` to `:10015`. If you didn't have this flag defined and wish to keep the old default, define `firehose-grpc-listen-addr: :13042`.
- Firehose logs now include auth information (userID, keyID, realIP) along with blocks + egress bytes sent.
- The default value for `merger-grpc-listen-addr` changed from `:13012` to `:10012`. If you didn't have this flag defined and wish to keep the old default, define `merger-grpc-listen-addr: :13012`.
- The default value for `substreams-tier1-grpc-listen-addr` changed from `:13044` to `:10016`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-grpc-listen-addr: :13044`.
- The default value for `substreams-tier1-subrequests-endpoint` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-subrequests-endpoint: :13044`.

  > [!NOTE]
  > Must align with `substreams-tier1-grpc-listen-addr`!

- The default value for `substreams-tier2-grpc-listen-addr` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier2-grpc-listen-addr: :13045`.
- Added field `DetailLevel` (Base, Extended (default)) to `sf.ethereum.type.v2.Block` to distinguish the new blocks produced by polling RPC (base) from the blocks normally produced with Firehose instrumentation (extended)
- Added command `tools fix-bloated-merged-blocks` to go through a range of possibly corrupted merged-blocks (with duplicates and out-of-range blocks) and try to fix them, writing the fixed merged-blocks files to another destination.
- Transform `sf.ethereum.transform.v1.LightBlock` is no longer supported; it has been deprecated for a long time and should not be used anywhere.
You may have certain merged-blocks files (most likely OLD blocks) that contain more than 100 blocks (with duplicate or extra out-of-bound blocks).
- Find the affected files by running the following command (it can be run multiple times in parallel, over smaller ranges):

  ```
  tools check merged-blocks-batch <merged-blocks-store> <start> <stop>
  ```

- If you see any affected range, produce fixed merged-blocks files with the following command, on each range:

  ```
  tools fix-bloated-merged-blocks <merged-blocks-store> <output-store> <start>:<stop>
  ```

- Copy the merged-blocks files created in the output-store over to your merged-blocks-store, replacing the corrupted files.
- Fixed a regression where `reader-node-role` was changed to `dev` by default, putting back the default `geth` value.
- Bump Substreams to `v1.1.20` with some minor bug fixes related to start block processing
- Added `tools poll-rpc-blocks` command to launch an RPC-based poller that acts as a firehose extractor node, printing base64-encoded protobuf blocks to stdout (used by the 'dev' node-type). It creates "light" blocks, without traces and ordinals.
- Added `--dev` flag to the `start` command to simplify running a local firehose+substreams stack from a development node (ex: Hardhat).
  - This flag overrides `--reader-node-path`, instead pointing to the fireeth binary itself.
  - This flag overrides `--reader-node-type`, setting it to `dev` instead of `geth`. This node type has the following default `reader-node-arguments`: `tools poll-rpc-blocks http://localhost:8545 0`
  - It also removes `node` from the list of default apps
- Substreams: fixed metrics calculations (per-module processing-time and external calls were wrong)
- Substreams: fixed immediate EOF when streaming from block 0 to (unbounded) in dev mode
- Bumped substreams to `v1.1.18` with a regression fix for when a substreams has a start block in the reversible segment
- Bumped substreams to `v1.1.17` with a fix for a missing decrement on the `substreams_active_requests` metric
The `--common-auth-plugin` got back the ability to use `secret://<expected_secret>?[user_id=<user_id>]&[api_key_id=<api_key_id>]`, in which case requests are authenticated based on the `Authorization: Bearer <actual_secret>` header and continue only if `<actual_secret> == <expected_secret>`.
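A hedged sketch of the URL shape (the secret and identifiers are placeholders):

```bash
# Clients must then send: Authorization: Bearer my-shared-secret
fireeth start --common-auth-plugin="secret://my-shared-secret?user_id=dev&api_key_id=dev"
```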
- Bumped substreams to `v1.1.16` with support for the metrics `substreams_active_requests` and `substreams_counter`
- If you started reprocessing the blockchain blocks using release v1.4.14 or v1.4.15, you will need to run the following command to fix the blocks affected by another bug: `fireeth tools fix-polygon-index /your/merged/blocks /temporary/destination 0 48200000` (note that you can run multiple instances of this command in parallel to cover the range of blocks from 0 to current HEAD in smaller chunks)
- Fix another data issue found in polygon blocks: blocks that contain a single "system" transaction have "Index=1" for that transaction instead of "Index=0"
- (Substreams) fixed regressions for relative start-blocks for substreams (see https://github.com/streamingfast/substreams/releases/tag/v1.1.14)
If you are indexing Polygon or Mumbai chains, you will need to reprocess the chain from genesis, as your existing Firehose blocks are missing some system transactions.
As always, this can be done with multiple client nodes working in parallel on different chain's segment if you have snapshots at various block heights.
Golang `1.21+` is now also required to build the project.
- Fixed post-processing of polygon blocks: some system transactions were not "bundled" correctly.
- (Substreams) fixed validations for invalid start-blocks (see https://github.com/streamingfast/substreams/releases/tag/v1.1.13)
- Added `tools compare-oneblock-rpc` command to perform a validation between a firehose 'one-block file' and blocks+trx+logs fetched from an RPC endpoint
- The `tools print` subcommands now use hex instead of base64 to encode values, making them easier to use
> [!IMPORTANT]
> The Substreams service exposed from this version will send progress messages that cannot be decoded by substreams clients prior to v1.1.12. Streaming of the actual data will not be affected. Clients will need to be upgraded to properly decode the new progress messages.
- Bumped substreams to `v1.1.12` to support the new progress message format. Progression now relates to stages instead of modules. You can get stage information using the `substreams info` command starting at version `v1.1.12`.
- added `tools compare-blocks-rpc` command to perform a validation between firehose blocks and blocks+trx+logs fetched from an RPC endpoint
- More tolerant retry/timeouts on filesource (prevent "Context Deadline Exceeded")
This release mainly brings `reader-node` Firehose Protocol 2.3 support for all networks, not just Polygon. This is important for the upcoming releases of Firehose-enabled `geth` versions 1.2.11 and 1.2.12 that are going to be released shortly.
Golang `1.20+` is now also required to build the project.
- Support reader node Firehose Protocol 2.3 on all networks now (and not just Polygon).
- Removed `--substreams-tier1-request-stats` and `--substreams-tier2-request-stats` (substreams request-stats are now always sent to clients)
- `tools check merged-blocks` now correctly prints missing block gaps even without `print-full` or `print-stats`.
- Now requires Go 1.20+ to compile the project.
- Substreams bumped: better "Progress" messages
- Bumped `firehose` and `substreams` libraries to fix a bug where live blocks were not metered correctly.
- Fixed: jobs would hang when the flags `--substreams-state-bundle-size` and `--substreams-tier1-subrequests-size` had different values. The latter flag has been completely removed; subrequests will be bound to the state bundle size.
- Added support for continuous authentication via the grpc auth plugin (allowing cutoff triggered by the auth system).
The `substreams` server now accepts the `X-Sf-Substreams-Cache-Tag` header to select which Substreams state store URL should be used by the request. When performing a Substreams request, the server will pick the state store based on the header. This enables consumers to stay on the same cache version when the operator needs to bump the data version (reasons for this could be a bug in the Substreams software that caused some cached data to be corrupted or invalid).
To benefit from this, operators that currently have a version in their state store URL should move the version part from `--substreams-state-store-url` to the new flag `--substreams-state-store-default-tag`. For example, if today you have in your config:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>/v3
```
You should convert to:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>
    substreams-state-store-default-tag: v3
```
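A hedged sketch of a client pinning the `v3` cache via the header, here with `grpcurl` (any gRPC client able to set headers works; the endpoint and request body are placeholders):

```bash
grpcurl -H "X-Sf-Substreams-Cache-Tag: v3" -d '{}' \
  my-substreams-endpoint:443 sf.substreams.rpc.v2.Stream/Blocks
```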
The `substreams` scheduler has been improved to reduce the number of required jobs for parallel processing. This affects `backprocessing` (preparing the states of modules up to a "start-block") and `forward processing` (preparing the states and the outputs to speed up streaming in production-mode).
Jobs on `tier2` workers are now divided into "stages", each stage generating the partial states for all the modules that have the same dependencies. A `substreams` that has a single store won't be affected, but one that has 3 top-level stores, which used to run 3 jobs for every segment, now only runs a single job per segment to get all the states ready.
The apps `substreams-tier1` and `substreams-tier2` should be upgraded concurrently. Some calls will fail while versions are misaligned.
- Substreams bumped to version v1.1.9
- Authentication plugin `trust` can now specify an exclusive list of `allowed` headers (all lowercase), ex: `trust://?allowed=x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag`
- The `tier2` app no longer uses the `common-auth-plugin`; `trust` will always be used, so that `tier1` can pass down its headers (ex: `X-Sf-Substreams-Cache-Tag`).
- Fixed a bug in `substreams-tier1` and `substreams-tier2` which caused "live" blocks to be sent while the block(s) previously received by the stream were historic.
- Added a check for readiness of the `dauth` provider when answering "/healthz" on firehose and substreams
- Changed `--substreams-tier1-debug-request-stats` to `--substreams-tier1-request-stats`, which enables request stats logging on Substreams tier1
- Changed `--substreams-tier2-debug-request-stats` to `--substreams-tier2-request-stats`, which enables request stats logging on Substreams tier2
- Fixed an occasional panic in substreams-tier1 caused by a race condition
- Fixed the grpc error codes for substreams tier1: Unauthenticated on bad auth, Canceled (endpoint is shutting down, please reconnect) on shutdown
- Fixed the grpc healthcheck method on substreams-tier1 (regression)
- Fixed the default value for flag `common-auth-plugin`: now set to `trusted://` instead of panicking on the removed `null://`
- Substreams (@v1.1.6) is now out of the `firehose` app, and must be started using the `substreams-tier1` and `substreams-tier2` apps!
- Most substreams-related flags have been changed:
  - common: `--substreams-rpc-cache-chunk-size`, `--substreams-rpc-cache-store-url`, `--substreams-rpc-endpoints`, `--substreams-state-bundle-size`, `--substreams-state-store-url`
  - tier1: `--substreams-tier1-debug-request-stats`, `--substreams-tier1-discovery-service-url`, `--substreams-tier1-grpc-listen-addr`, `--substreams-tier1-max-subrequests`, `--substreams-tier1-subrequests-endpoint`, `--substreams-tier1-subrequests-insecure`, `--substreams-tier1-subrequests-plaintext`, `--substreams-tier1-subrequests-size`
  - tier2: `--substreams-tier2-discovery-service-url`, `--substreams-tier2-grpc-listen-addr`
- Some auth plugins have been removed; the new available plugins for `--common-auth-plugins` are `trust://` and `grpc://`. See https://github.com/streamingfast/dauth for details
- Metering features have been added; the available plugins for `--common-metering-plugin` are `null://`, `logger://`, `grpc://`. See https://github.com/streamingfast/dmetering for details
- Support for reader node Firehose Protocol 2.3 (for parallel processing of transactions, added to polygon 'bor' v0.4.0)
- Removed the `tools upgrade-merged-blocks` command. Normalization is now part of the console reader within 'codec', not the 'types' package, and cannot be done a posteriori.
- Updated metering to fix dependencies
- Updated metering (bumped versions of `dmetering`, `dauth`, and `firehose` libraries)
- Fixed firehose service healthcheck on shutdown
- Fixed panic in the download-blocks-from-firehose tool
- When upgrading a substreams server to this version, you should delete all existing module caches to benefit from deterministic output
- Switch default engine from `wasmtime` to `wazero`
- Prevent reusing memory between blocks in the wasm engine to fix determinism
- Switch our store operations from bigdecimal to fixed-point decimal to fix determinism
- Sort the store deltas from `DeletePrefixes()` to fix determinism
- Implement staged module execution within a single block.
- "Fail fast" on repeating requests with deterministic failures for a "blacklist period", preventing waste of resources
- SessionInit protobuf message now includes resolvedStartBlock and MaxWorkers, sent back to the client
- This release brings an update to `substreams` to `v1.1.4`, which includes the following:
  - Changes the module hash computation implementation to allow reusing caches across substreams that 'import' other substreams as a dependency.
  - Faster shutdown of requests that fail deterministically
  - Fixed memory leak in RPC calls
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.
The components should be deployed simultaneously to `tier1` and `tier2`, or users will end up with backend error(s) saying that some partial files are not found. These errors will be resolved when both tiers are upgraded.
- Added Substreams scheduler tracing support. Enable tracing by setting the ENV variable `SF_TRACING` to one of the following:
  - `stdout://`
  - `cloudtrace://[host:port]?project_id=<project_id>&ratio=<0.25>`
  - `jaeger://[host:port]?scheme=<http|https>`
  - `zipkin://[host:port]?scheme=<http|https>`
  - `otelcol://[host:port]`
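For example, a minimal sketch sending scheduler traces to a local Jaeger collector (host, port and scheme are illustrative):

```bash
export SF_TRACING="jaeger://localhost:14268?scheme=http"
fireeth start substreams-tier1
```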
- This release brings an update to `substreams` to `v1.1.3`, which includes the following:
  - Fixes an important bug that could have generated corrupted store state files. This is important for developers and operators.
  - Fixes for race conditions that would return a failure when multiple identical requests are backprocessing.
  - Fixes and speed/scaling improvements around the engine.
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.
This release includes a small change in the internal RPC layer between `tier1` processes and `tier2` processes. This change requires an ordered upgrade of the processes to avoid errors.
The components should be deployed in this order:
1. Deploy and roll out `tier1` processes first
2. Deploy and roll out `tier2` processes second
If you upgrade in the wrong order, or if somehow `tier2` processes start using the new protocol without `tier1` being aware, users will end up with backend error(s) saying that some partial files are not found. Those will be resolved only when `tier1` processes have been upgraded successfully.
- Substreams running without a specific tier2 `substreams-client-endpoint` will now expose the tier2 service `sf.substreams.internal.v2.Substreams` so it can be used internally.
> [!WARNING]
> If you don't use dedicated tier2 nodes, make sure that you don't expose `sf.substreams.internal.v2.Substreams` to the public (from your load-balancer or using a firewall)
- Flag `substreams-partial-mode-enabled` renamed to `substreams-tier2`
- Flag `substreams-client-endpoint` now defaults to an empty string, which means it is its own client-endpoint (as it was before the change to protocol V2)
The Substreams protocol changed from `sf.substreams.v1.Stream/Blocks` to `sf.substreams.rpc.v2.Stream/Blocks` for the client-facing service. This changes the way that substreams clients are notified of chain reorgs.
All substreams clients need to be upgraded to support this new protocol.
See https://github.com/streamingfast/substreams/releases/tag/v1.1.1 for details.
The `firehose-client` tool now accepts a `--limit` flag to only send that number of blocks. Get the latest block like this: `fireeth tools firehose-client <endpoint> --limit=1 -- -1 0`
This is a bug fix release for node operators that are about to upgrade to the Shanghai release. The Firehose-instrumented `geth` compatible with the Shanghai release introduced a new message, `CANCEL_BLOCK`. In some circumstances, we had a bug where the console reader panicked when the message was received while no block was actively being assembled.
This release fixes this bogus behavior by simply ignoring the `CANCEL_BLOCK` message when there is no active block, which is harmless. Every node operator that upgrades to https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.11.5-fh2.2 should upgrade to this version.
> [!NOTE]
> There is no need to update the Firehose-instrumented `geth` binary; only `fireeth` needs to be bumped if you are already at the latest `geth` version.
- Fixed a bug in the console reader when seeing `CANCEL_BLOCK` in certain circumstances.
- Now using Golang 1.20 for building releases.
- Changed default value of flag `substreams-sub-request-block-range-size` from `1000` to `10000`.
- Fixed a bug in data normalization for the Polygon chain which would cause panics on certain blocks.
- Support for GCP `archive` types of snapshots
- This release implements the new `CANCEL_BLOCK` instruction from Firehose protocol 2.2 (`fh2.2`), to reject blocks that failed post-validation.
- This release fixes Polygon "StateSync" transactions by grouping the calls inside an artificial transaction.
  If you have previous blocks from a Polygon chain (bor), you will need to reprocess all your blocks from the node, because some StateSync transactions may be missing from some blocks.
This release supports the new Firehose node exchange format 2.2, which introduced a new exchanged message, `CANCEL_BLOCK`. This has an implication on which Firehose-instrumented `Geth` binary you can use with this release.
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.2` (like `geth-v1.11.4-fh2.2-1`), you must use `firehose-ethereum` version `>= 1.3.6`
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.1` (like `geth-v1.11.3-fh2.1`), you can use `firehose-ethereum` version `>= 1.0.0`

New releases of the Firehose-instrumented `Geth` binary for all chains will soon all be tagged `fh2.2`, so upgrading to `>= 1.3.6` of `firehose-ethereum` will be required.
This release is required if you run on Goerli. It is mostly about supporting the upcoming Shanghai fork, which was activated on Goerli on March 14th.
- Added support for the `withdrawal` balance change reason in the block model; this is required for running on the most recent Goerli Shanghai hard fork.
- Added support for `withdrawals_root` on `Header` in the block model; this will be populated only if the chain has activated the Shanghai hard fork.
- `--substreams-max-fuel-per-block-module` limits the number of wasmtime instructions for a single module in a single block.
Blocks that were migrated from v2 to v3 using 'upgrade-merged-blocks' should now be considered invalid. The upgrade mechanism did not correctly fix the "caller" on DELEGATECALLs when these calls were nested under another DELEGATECALL.
You should run `upgrade-merged-blocks` again if you previously used 'v2' blocks that were upgraded to 'v3'.
This rate limiting uses a leaky-bucket mechanism, allowing an initial burst of X connections, then a new connection every Y seconds, or whenever an existing connection closes.
Use `--firehose-rate-limit-bucket-size=50` and `--firehose-rate-limit-bucket-fill-rate=1s` to allow 50 connections instantly, plus another connection every second.
Note that when the server is above the limit, it waits 500ms before returning `codes.Unavailable` to the client, forcing a minimal back-off.
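A minimal sketch of how these two flags fit into a server invocation (the app name and values are illustrative):

```bash
# Allow an initial burst of 50 connections, then refill one connection
# slot per second; clients above the limit get codes.Unavailable after
# a 500ms wait.
fireeth start firehose \
  --firehose-rate-limit-bucket-size=50 \
  --firehose-rate-limit-bucket-fill-rate=1s
```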
- Substreams `RpcCall` objects are now validated before being performed, to ensure they are correct.
- Substreams `RpcCall` JSON-RPC error code `-32602` is now treated as a deterministic error (invalid request).
- `tools compare-blocks` now correctly handles segment health reporting and properly prints all differences with `-diff`.
- `tools compare-blocks` now ignores 'unknown fields' in the protobuf message, unless `--include-unknown-fields=true` is set.
- `tools compare-blocks` now ignores the case where a block bundle contains the 'last block of the previous bundle' (a now-deprecated feature).
- support for "requester pays" buckets on Google Storage in url, ex:
gs://my-bucket/path?project=my-project-id
- substreams were also bumped to current March 1st develop HEAD
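A minimal sketch of pointing a store flag at a requester-pays bucket (bucket, path, and project are illustrative):

```bash
# The ?project= query parameter names the GCP project that is billed
# for reads against the requester-pays bucket.
fireeth start firehose \
  --common-merged-blocks-store-url='gs://my-bucket/merged-blocks?project=my-project-id'
```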
- Increased gRPC max received message size accepted by Firehose and Substreams gRPC endpoints to 25 MiB.
- Command `fireeth init` has been removed; this was a leftover from another time and the command was not working anyway.
- Flag `common-auto-max-procs` to optimize Go thread management using github.com/uber-go/automaxprocs
- Flag `common-auto-mem-limit-percent` to specify `GOMEMLIMIT` based on a percentage of available memory
- Updated to Substreams version `v0.2.0`; please refer to the release page for further info about Substreams changes.
- Breaking Config values `substreams-stores-save-interval` and `substreams-output-cache-save-interval` have been merged into a single value, to avoid potential bugs that would arise when the value is different for those two. The new configuration value is called `substreams-cache-save-interval`.
  - To migrate, remove usage of `substreams-stores-save-interval: <number>` and `substreams-output-cache-save-interval: <number>` if defined in your config file and replace them with `substreams-cache-save-interval: <number>`; if you had two different values before, pick the bigger of the two as the new value. We are currently setting it to `1000` for Ethereum Mainnet.
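A quick way to spot config files still using the removed keys (a sketch, not part of `fireeth`; the config path is illustrative):

```bash
# List any remaining occurrences of the two removed config values so
# they can be replaced by a single substreams-cache-save-interval entry.
grep -rnE 'substreams-(stores|output-cache)-save-interval' ./etc/firehose/
```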
- Fixed various issues with `fireeth tools check merged-blocks`:
  - The `stopWalk` error is not reported as a real `error` anymore.
  - `Incomplete range` should now be printed more accurately.
- Release made to fix our build workflows; nothing different from v1.3.0.
- Updated to Substreams `v0.1.0`; please refer to the release page for further info about Substreams changes.

  Warning The state output format for `map` and `store` modules has changed internally to be more compact in Protobuf format. When deploying this new version and using the Substreams feature, previously existing state files should be deleted, or the deployment should be updated to point to a new store location. The state output store is defined by the `--substreams-state-store-url` flag.
- New Prometheus metric `console_reader_trx_read_count` can be used to obtain the rate of transactions read from the node over a period of time.
- New Prometheus metric `console_reader_block_read_count` can be used to obtain the rate of blocks read from the node over a period of time.
- Added `--header-only` support on `fireeth tools firehose-client` (see the sketch after this list).
- Added `HeaderOnly` transform that can be used to return only the Block's header along with a few top-level fields: `Ver`, `Hash`, `Number` and `Size`.
- Added `fireeth tools firehose-prometheus-exporter` to use as a client-side monitoring tool of a Firehose endpoint.
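A hedged sketch of a header-only request, mirroring the client invocation style shown elsewhere in these notes (the endpoint and block range are illustrative):

```bash
# Stream only block headers (no transactions or calls) for a two-block
# range; the server applies the HeaderOnly transform for us.
fireeth tools firehose-client <endpoint> --header-only -- 1000000 1000002
```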
- Deprecated `LightBlock`: it will be removed in the next major version; its goal is now much better handled by the `CombinedFilter` transform, or the `HeaderOnly` transform if you require only the Block's header.
- Hotfix 'nil pointer' panic when saving uninitialized cache.
- Changed cache file format for stores and outputs (faster with vtproto) -- requires removing the existing state files.
- Various improvements to scheduling.
- Fixed `eth_call` handler not flagging `out of gas` errors as deterministic.
- Fixed memory leak in wasmtime.
- Removed the unused 'previous' one-block in merged-blocks (e.g. block 99 inside bundle 100).
- Fix: also prevents a rare bug where "very old" one-blocks were bundled into merged-blocks.
- Added `sf.firehose.v2.Fetch/Block` endpoint on firehose; allows fetching a single block by num, num+ID, or cursor.
- Added `tools firehose-single-block-client` to call that new endpoint.
- Renamed tools `normalize-merged-blocks` to `upgrade-merged-blocks`.
- Fixed `common-blocks-cache-dir` flag's description.
- Fixed `DELEGATECALL`'s 'caller' (a.k.a. `from`) -> requires an upgrade of blocks to `version: 3`
- Fixed the `execution aborted (timeout = 5s)` hard-coded timeout value used when detecting in Substreams whether an `eth_call` error response was deterministic.
Assuming that you are running a firehose deployment v1.1.0 writing blocks to folders `/v2-oneblock`, `/v2-forked` and `/v2`, you will deploy a new setup that writes blocks to folders `/v3-oneblock`, `/v3-forked` and `/v3`.
This procedure describes an upgrade without any downtime. With proper parallelization, it should be possible to complete this upgrade within a single day.
- Launch a new reader with this code, running the instrumented geth binary: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.10.25-fh2.1 (you can start from a backup that is close to head)
- Upgrade your merged-blocks from `version: 2` to `version: 3` using `fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 {start} {stop}` (you can run multiple upgrade commands in parallel to cover the whole block range; see the sketch after this list)
- Create combined indexes from those new blocks with `fireeth start combined-index-builder` (you can run multiple commands in parallel to fill the block range)
- When your merged-blocks have been upgraded and the one-block files are being produced by the new reader, launch a merger
- When the reader, merger and combined-index-builder have caught up to live, you can launch the relayer(s) and firehose(s)
- When the firehoses are ready, you can switch traffic to them.
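A minimal sketch of running the upgrade in parallel over two ranges (paths and block boundaries are illustrative):

```bash
# Each invocation covers a disjoint {start} {stop} range; run as many
# in parallel as your machine and stores can sustain.
fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 0 8000000 &
fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 8000000 16000000 &
wait
```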
- Added 'SendAllBlockHeaders' param to the CombinedFilter transform, for when you want to prevent skipping blocks but still want to filter out transactions.
- Reduced how often `reader read statistics` are displayed, down to every 30s (previously every 5s), and renamed the log line to `reader node statistics`.
- Fix `fireeth tools download-blocks-from-firehose` tool that was not working anymore.
- Simplify `forkablehub` startup performance cases.
- Fix relayer detection of a hole in stream blocks (restart on unrecoverable issue).
- Fix possible panic in hub when calls to the one-block store time out.
- Fix slow merger one-block-file deletions when there are more than 10000 of them.
- The binary name has changed from `sfeth` to `fireeth` (aligned with https://firehose.streamingfast.io/references/naming-conventions)
- The repo name has changed from `sf-ethereum` to `firehose-ethereum`
- This will require reprocessing the chain to produce new blocks
- The Protobuf Block model is now tagged `sf.ethereum.type.v2` and contains the following improvements:
  - Fixed Gas Price on dynamic transactions (post-London-fork on Ethereum Mainnet, EIP-1559)
  - Added "Total Ordering" concept: an 'Ordinal' field on all events within a block (trx begin/end, call, log, balance change, etc.)
  - Added TotalDifficulty field to Ethereum blocks
  - Fixed wrong transaction status for contract deployments that fail due to out of gas on pre-Homestead transactions (aligned with status reported by chain: SUCCESS, even if no contract code is set)
  - Added more instrumentation around AccessList and DynamicFee transactions; removed some elements that were useless or could not be derived from other elements in the structure, e.g. gasEvents
  - Added support for finalized block numbers (moved outside the proto-ethereum block, to the firehose bstream v2 block)
- There are no more "forked blocks" in the merged-blocks bundles:
  - The merged-blocks are therefore produced only after finality has passed (before The Merge, this means after 200 confirmations).
  - One-block files close to HEAD stay in the one-blocks store for longer
  - Blocks that do not make it into the merged-blocks (forked out because of a re-org) are uploaded to another store (`common-forked-blocks-store-url`) and kept there for a while (to allow resolving cursors)
- This will require changes in most firehose clients
  - A compatibility layer has been added to still support `sf.firehose.v1.Stream/Blocks`, but only for specific values of 'ForkSteps' in the request: 'irreversible' or 'new+undo'
- The Firehose Blocks protocol is now under `sf.firehose.v2` (bumped from `sf.firehose.v1`):
  - Step type `IRREVERSIBLE` renamed to `FINAL`
  - `Blocks` request now only allows 2 modes regarding steps: `NEW,UNDO` and `FINAL` (gated by the `final_blocks_only` boolean flag)
  - Blocks that are sent out can have the combined step `NEW+FINAL` to prevent sending the same blocks over and over if they are already final
- Removed the Irreversible indices completely (because the merged-blocks only contain final blocks now)
- Deprecated the "Call" and "log" indices (`xxxxxxxxxx.yyy.calladdrsig.idx` and `xxxxxxxxxx.yyy.logaddrsig.idx`), now replaced by the "combined" index
- Moved the `sfeth tools generate-...` commands out to a new app that can be launched with `sfeth start generate-combined-index[,...]`
- All config via environment variables that started with `SFETH_` now starts with `FIREETH_`
- All logs now output on stderr instead of stdout, as was previously the case
- Changed `config-file` default from `./sf.yaml` to `""`, preventing failure without this flag.
- Renamed `common-blocks-store-url` to `common-merged-blocks-store-url`
- Renamed `common-oneblock-store-url` to `common-one-block-store-url`, now used by firehose and relayer apps
- Renamed `common-blockstream-addr` to `common-live-blocks-addr`
- Renamed the `mindreader` application to `reader`
- Renamed all the `mindreader-node-*` flags to `reader-node-*`
- Added `common-forked-blocks-store-url` flag, used by merger and firehose
- Changed `--log-to-file` default from `true` to `false`
- Changed default verbosity level: now all loggers are `INFO` (instead of having most of them at `WARN`). `-v` will now activate all `DEBUG` logs
- Removed `common-block-index-sizes`, `common-index-store-url`
- Removed `merger-state-file`, `merger-next-exclusive-highest-block-limit`, `merger-max-one-block-operations-batch-size`, `merger-one-block-deletion-threads`, `merger-writers-leeway`
- Added `merger-stop-block`, `merger-prune-forked-blocks-after`, `merger-time-between-store-pruning`
- Removed `mindreader-node-start-block-num`, `mindreader-node-wait-upload-complete-on-shutdown`, `mindreader-node-merge-and-store-directly`, `mindreader-node-merge-threshold-block-age`
- Removed `firehose-block-index-sizes`, `firehose-irreversible-blocks-index-bundle-sizes`, `firehose-irreversible-blocks-index-url`, `firehose-realtime-tolerance`
- Removed `relayer-buffer-size`, `relayer-merger-addr`, `relayer-min-start-offset`
- If you depend on the proto file, update `import "sf/ethereum/type/v1/type.proto"` to `import "sf/ethereum/type/v2/type.proto"`
- If you depend on the proto file, update all occurrences of `sf.ethereum.type.v1.<Something>` to `sf.ethereum.type.v2.<Something>`
- If you depend on `sf-ethereum/types` as a library, update all occurrences of `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v1` to `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v2` (a bulk-update sketch follows this list).
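A hedged sketch of bulk-updating those references across a client repository (review the resulting diff before committing; the in-place `sed -i` syntax shown is GNU sed):

```bash
# Rewrite both the slash form (proto paths, Go import paths) and the
# dotted form (proto package references) from v1 to v2.
grep -rl 'sf[./]ethereum[./]type[./]v1' --include='*.go' --include='*.proto' . \
  | xargs sed -i \
      -e 's|sf/ethereum/type/v1|sf/ethereum/type/v2|g' \
      -e 's|sf\.ethereum\.type\.v1|sf.ethereum.type.v2|g'
```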
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- Because of the changes in the Ethereum block protocol, an existing deployment cannot be migrated in-place.
  - You must deploy firehose-ethereum v1.0.0 on a new environment (without any prior block or index data)
  - You can put this new deployment behind a gRPC load-balancer that routes `/sf.firehose.v2.Stream/*` and `/sf.firehose.v1.Stream/*` to your different versions.
  - Go through the list of changed "Flags and environment variables" and adjust your deployment accordingly:
    - Determine a (shared) location for your `forked-blocks`.
    - Make sure that you set the `one-block-store` and `forked-blocks-store` correctly on all the apps that now require them.
    - Add the `generate-combined-index` app to your new deployment instead of the `tools` command for call/logs indices.
- If you want to reprocess blocks in batches while you set up a "live" deployment:
  - run your reader node from prior data (e.g. from a snapshot)
  - set the `--common-first-streamable-block` flag to a 100-block-aligned boundary right after where this snapshot starts, and use this flag on all apps (see the sketch after this list)
  - perform batch merged-blocks reprocessing jobs
  - when all the blocks are present, set the `common-first-streamable-block` flag to 0 on your deployment to serve the whole range
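A minimal sketch of the boundary flag during backfill (the block number is illustrative and must be 100-block-aligned; the app list is an assumption based on the apps named in these notes):

```bash
# Every app in the deployment gets the same boundary until the full
# range has been backfilled, after which the flag is set back to 0.
fireeth start reader,merger,relayer,firehose \
  --common-first-streamable-block=15000000
```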
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- The `reader` does NOT merge block files directly anymore: you need to run it alongside a `merger`:
  - determine a `start` and `stop` block for your reprocessing job, aligned on a 100-block boundary right after your Geth data snapshot
  - set `--common-first-streamable-block` to your start block
  - set `--merger-stop-block` to your stop block
  - set `--common-one-block-store-url` to a local folder accessible to both the `merger` and `mindreader` apps
  - set `--common-merged-blocks-store-url` to the final (e.g. remote) folder where you will store your merged-blocks
  - run both apps like this: `fireeth start reader,merger --...` (a fuller sketch follows below)
- You can run as many batch jobs like this in parallel as you like to produce the merged-blocks, as long as you have Geth data snapshots that start at those points
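Assembling the flags above into a single invocation might look like this sketch (paths and block numbers are illustrative):

```bash
# One batch job covering blocks 1,000,000 to 1,100,000; the reader and
# merger share the local one-block store, merged bundles go to the
# final (remote) store.
fireeth start reader,merger \
  --common-first-streamable-block=1000000 \
  --merger-stop-block=1100000 \
  --common-one-block-store-url=/data/one-blocks \
  --common-merged-blocks-store-url=gs://my-bucket/merged-blocks
```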
- Run batch jobs like this: `fireeth start generate-combined-index --common-blocks-store-url=/path/to/blocks --common-index-store-url=/path/to/index --combined-index-builder-index-size=10000 --combined-index-builder-start-block=0 [--combined-index-builder-stop-block=10000] --combined-index-builder-grpc-listen-addr=:9000`
- Added `tools firehose-client` command with filter/index options
- Added `tools normalize-merged-blocks` command to remove forked blocks from merged-blocks files (it cannot transform Ethereum blocks V1 into V2, because some fields are missing in V1)
- Added substreams server support in the firehose app (alpha) through the `--substreams-enabled` flag
- The firehose gRPC endpoint now supports requests that are compressed using `gzip` or `zstd`
- The merger does not expose the `PreMergedBlocks` endpoint over gRPC anymore, only HealthCheck (the relayer does not need to talk to it)
- The flag `--firehose-genesis-file` is now set automatically on `reader` nodes if their `reader-node-bootstrap-data-url` config value is set to a `genesis.json` file.
- Note to other Firehose implementors: we changed all command line flags to fit the required/optional format referred to here: https://en.wikipedia.org/wiki/Usage_message
- Added a Prometheus boolean metric called 'ready', with label 'app', to all apps (firehose, merger, mindreader-node, node, relayer, combined-index-builder)
- Removed the `firehose-blocks-store-urls` flag (the feature of using multiple stores is now deprecated; it caused confusion and issues with block caching); use `common-blocks-store-url` instead.
- Fixed a problem using the S3 provider where the S3 API returns an empty filename (we now ignore results with an empty filename at consume time).
- Fixed an issue where the merger could panic on a new deployment
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around, with block numbers below the current bundle's high boundary.
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block hole detection behavior on missing ranges (bumped `sf-tools`)
- Fixed a deadlock issue related to S3 storage error handling (bumped `dstore`)
- Added `tools download-from-firehose` command to fetch blocks and save them as merged-blocks files locally.
- Added `cloud-gcp://` auth module (bumped `dauth`)
- substreams-alpha client
- gke-pvc-snapshot backup module
- Fixed a potential 'panic' in the `merger` on a new chain
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around, with block numbers below the current bundle's high boundary.
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block hole detection behavior on missing ranges (bumped `sf-tools`)
- Added `tools download-from-firehose` command to fetch blocks and save them as merged-blocks files locally.
- Added `cloud-gcp://` auth module (bumped `dauth`)
- The default text `encoder` used to encode log entries now emits the level when coloring is disabled.
- Default value for flag `--mindreader-node-enforce-peers` is now `""`; this was changed because the default value was useful only in development, when running a local `node-manager` as either the miner or a peering node.
- Added block data file caching (called `ATM`); this is to reduce the memory usage of components keeping block objects in memory.
- Added transforms: LogFilter, MultiLogFilter, CallToFilter, MultiCallToFilter to only return transaction traces that match logs or called addresses.
- Added support for irreversibility indexes in firehose to prevent replaying reorgs when streaming old blocks.
- Added support for log and call indexes to skip old blocks that do not match any transform filter.
- Updated all Firehose stack direct dependencies.
- Updated confusing flag behavior for `--common-system-shutdown-signal-delay` and its interaction with gRPC connection draining in the `firehose` component, which sometimes prevented it from shutting down.
- An error is now reported if the flag `merge-threshold-block-age` is set way too low (< 30s).
- Removed some old components that are not required by the Firehose stack directly; the repository is now as lean as it can be.
- Fixed Firehose gRPC listening address over plain text.
- Fixed automatic merging of files within the `mindreader`: it is now much more robust than before.