Syncing strategy refactoring (part 2) #5666

nazar-pc · 2024-09-10T16:23:18Z

Description

Follow-up to #5469 and mostly covering #5333.

The primary change here is that syncing strategy is no longer created inside of syncing engine, instead syncing strategy is an argument of syncing engine, more specifically it is an argument to build_network that most downstream users will use. This also extracts addition of request-response protocols outside of network construction, making sure they are physically not present when they don't need to be (imagine syncing strategy that uses none of Substrate's protocols in its implementation for example).

This technically allows to completely replace syncing strategy with whatever strategy chain might need.

There will be at least one follow-up PR that will simplify SyncingStrategy trait and other public interfaces to remove mentions of block/state/warp sync requests, replacing them with generic APIs, such that strategies where warp sync is not applicable don't have to provide dummy method implementations, etc.

Integration

Downstream projects will have to write a bit of boilerplate calling build_polkadot_syncing_strategy function to create previously default syncing strategy.

Review Notes

Please review PR through individual commits rather than the final diff, it will be easier that way. The changes are mostly just moving code around one step at a time.

Checklist

My PR includes a detailed description as outlined in the "Description" and its two subsections above.
My PR follows the labeling requirements of this project (at minimum one label for T required)
- External contributors: ask maintainers to put the right label on your PR.
I have made corresponding changes to the documentation (if applicable)

… strategy

…ategy

nazar-pc · 2024-09-10T16:24:09Z

@dmitry-markin @lexnv another one for you folks

dmitry-markin · 2024-09-12T15:10:19Z

Impeccable! Once again thanks for excellent work!

…refactoring-part-2 # Conflicts: # polkadot/node/service/src/lib.rs

lexnv · 2024-09-13T08:46:45Z

substrate/client/network/sync/src/strategy/warp.rs

+		let request = WarpProofRequest { begin };
+
+		let Some(protocol_name) = self.protocol_name.clone() else {
+			warn!(


nit: Wouldn't this generate too much noise?

It will not, this method is not supposed to be called when there is no need for warp requests, that would basically be an implementation bug. This code effectively migrated from substrate/client/network/sync/src/engine.rs (see the change at the bottom of the file). Should be clear if looking at individual commits.

It is just that the invariants for warp sync are a bit strange, which causes this property to be optional (I only did mechanical refactoring, I didn't change the algorithm).

lexnv

LGTM! Thanks for contributing! 🙏

nazar-pc · 2024-09-17T01:19:02Z

What else is left here? I have another set of changes that depend on this for review already 🙂

bkchr · 2024-09-17T14:59:31Z

substrate/client/service/src/builder.rs

 	/// Metrics.
 	pub metrics: NotificationMetrics,
 }

 /// Build the network service, the network status sinks and an RPC sender.
-pub fn build_network<TBl, TNet, TExPool, TImpQu, TCl>(
-	params: BuildNetworkParams<TBl, TNet, TExPool, TImpQu, TCl>,
+pub fn build_network<Block, Net, TxPool, IQ, Client>(


While I'm late to the party (sorry).

Before this goes into any stable release or whatever, can we please revert this?

Basically we should not expose build_polkadot_syncing_strategy . Instead we should do this internally in this build_network function and then expose some build_network_advanced or whatever for @nazar-pc that takes then the syncing strategy. This way there is no breaking change to the outside for developers. (that only have a node and not any custom syncing stuff)

The end goal should be that all the node/service changes to all the nodes in this repo are reverted.

@lexnv or @dmitry-markin can you please take care of this?

#5737 actually extracted the whole syncing strategy out of build_network and exposed build_default_syncing_engine function to construct the typical setup.

The fact that build_network was not just building the network, but also syncing engine, syncing strategies and hardcoding all the protocol was a long standing problem IMO. It was also taking (and still does) the whole sc_service::config::Configuration, most of which it doesn't use, which is also a problem (though I didn't address it in any of my changes). It makes using Substrate as a library quite painful for those who need to customize it the way we do at Subspace.

My goal would be for build_network to be simplified to little more than Net::new() call. The fact that it unconditionally adds a whole bunch of the protocols is annoying and undesirable.

Having build_network_advanced is an option, but not a long-term solution the way I imagine it. Happy to push it into #5737 if that is something you feel strongly about.

I guess the gist of it is that build_network should build the network, not syncing engines and other stuff it is coupled with right now.

I guess the gist of it is that build_network should build the network, not syncing engines and other stuff it is coupled with right now.

I mean I'm not against this and I support this. Just about how it is done. I mean just keep the function do what it was doing all the time and then provide some extended way.

Reverted build_network's API to before this PR and added build_network_advanced instead in #5737

…_network` API before paritytech#5666

# Description Follow-up to paritytech#5469 and mostly covering paritytech#5333. The primary change here is that syncing strategy is no longer created inside of syncing engine, instead syncing strategy is an argument of syncing engine, more specifically it is an argument to `build_network` that most downstream users will use. This also extracts addition of request-response protocols outside of network construction, making sure they are physically not present when they don't need to be (imagine syncing strategy that uses none of Substrate's protocols in its implementation for example). This technically allows to completely replace syncing strategy with whatever strategy chain might need. There will be at least one follow-up PR that will simplify `SyncingStrategy` trait and other public interfaces to remove mentions of block/state/warp sync requests, replacing them with generic APIs, such that strategies where warp sync is not applicable don't have to provide dummy method implementations, etc. ## Integration Downstream projects will have to write a bit of boilerplate calling `build_polkadot_syncing_strategy` function to create previously default syncing strategy. ## Review Notes Please review PR through individual commits rather than the final diff, it will be easier that way. The changes are mostly just moving code around one step at a time. # Checklist * [x] My PR includes a detailed description as outlined in the "Description" and its two subsections above. * [x] My PR follows the [labeling requirements]( https://github.com/paritytech/polkadot-sdk/blob/master/docs/contributor/CONTRIBUTING.md#Process ) of this project (at minimum one label for `T` required) * External contributors: ask maintainers to put the right label on your PR. * [x] I have made corresponding changes to the documentation (if applicable) (cherry picked from commit 43cd6fd)

# Description This is a continuation of #5666 that finally fixes #5333. This should allow developers to create custom syncing strategies or even the whole syncing engine if they so desire. It also moved syncing engine creation and addition of corresponding protocol outside `build_network_advanced` method, which is something Bastian expressed as desired in #5 (comment) Here I replaced strategy-specific types and methods in `SyncingStrategy` trait with generic ones. Specifically `SyncingAction` is now used by all strategies instead of strategy-specific types with conversions. `StrategyKey` was an enum with a fixed set of options and now replaced with an opaque type that strategies create privately and send to upper layers as an opaque type. Requests and responses are now handled in a generic way regardless of the strategy, which reduced and simplified strategy API. `PolkadotSyncingStrategy` now lives in its dedicated module (had to edit .gitignore for this) like other strategies. `build_network_advanced` takes generic `SyncingService` as an argument alongside with a few other low-level types (that can probably be extracted in the future as well) without any notion of specifics of the way syncing is actually done. All the protocol and tasks are created outside and not a part of the network anymore. It still adds a bunch of protocols like for light client and some others that should eventually be restructured making `build_network_advanced` just building generic network and not application-specific protocols handling. ## Integration Just like #5666 introduced `build_polkadot_syncing_strategy`, this PR introduces `build_default_block_downloader`, but for convenience and to avoid typical boilerplate a simpler high-level function `build_default_syncing_engine` is added that will take care of creating typical block downloader, syncing strategy and syncing engine, which is what most users will be using going forward. `build_network` towards the end of the PR was renamed to `build_network_advanced` and `build_network`'s API was reverted to pre-#5666, so most users will not see much of a difference during upgrade unless they opt-in to use new API. ## Review Notes For `StrategyKey` I was thinking about using something like private type and then storing `TypeId` inside instead of a static string in it, let me know if that would preferred. The biggest change happened to requests that different strategies make and how their responses are handled. The most annoying thing here is that block response decoding, in contrast to all other responses, is dependent on request. This meant request had to be sent throughout the system. While originally `Response` was `Vec<u8>`, I didn't want to re-encode/decode request and response just to fit into that API, so I ended up with `Box<dyn Any + Send>`. This allows responses to be truly generic and each strategy will know how to downcast it back to the concrete type when handling the response. Import queue refactoring was needed to move `SyncingEngine` construction out of `build_network` that awkwardly implemented for `SyncingService`, but due to `&mut self` wasn't usable on `Arc<SyncingService>` for no good reason. `Arc<SyncingService>` itself is of course useless, but refactoring to replace it with just `SyncingService` was unfortunately rejected in #5454 As usual I recommend to review this PR as a series of commits instead of as the final diff, it'll make more sense that way. # Checklist * [x] My PR includes a detailed description as outlined in the "Description" and its two subsections above. * [x] My PR follows the [labeling requirements]( https://github.com/paritytech/polkadot-sdk/blob/master/docs/contributor/CONTRIBUTING.md#Process ) of this project (at minimum one label for `T` required) * External contributors: ask maintainers to put the right label on your PR. * [x] I have made corresponding changes to the documentation (if applicable)

nazar-pc added 8 commits September 10, 2024 18:59

Move syncing strategy construction outside syncing engine

1bc87f6

Minor refactoring

067b51b

Move state_request_protocol_name out of syncing engine into syncing…

1594a02

… strategy

Move warp_sync_protocol_name out of syncing engine into syncing str…

887cbe8

…ategy

Extract PolkadotSyncingStrategy instantiation into separate function

d705342

Rename generics

7ac5d79

Make syncing strategy an argument of build_network function

d7f9901

Add prdoc

1ce2caf

dmitry-markin added the T0-node This PR/Issue is related to the topic “node”. label Sep 10, 2024

Fix formatting

03366ec

dmitry-markin approved these changes Sep 12, 2024

View reviewed changes

dmitry-markin requested a review from bkchr September 12, 2024 15:10

Merge remote-tracking branch 'upstream/master' into syncing-strategy-…

f56be45

…refactoring-part-2 # Conflicts: # polkadot/node/service/src/lib.rs

lexnv reviewed Sep 13, 2024

View reviewed changes

lexnv approved these changes Sep 13, 2024

View reviewed changes

shamil-gadelshin mentioned this pull request Sep 13, 2024

State sync up to a specific block. #4407

Closed

Merge branch 'master' into syncing-strategy-refactoring-part-2

5f32ac0

dmitry-markin added this pull request to the merge queue Sep 17, 2024

Merged via the queue into paritytech:master with commit 43cd6fd Sep 17, 2024
206 of 208 checks passed

nazar-pc deleted the syncing-strategy-refactoring-part-2 branch September 17, 2024 12:25

nazar-pc mentioned this pull request Sep 17, 2024

Syncing strategy refactoring (part 3) #5737

Merged

3 tasks

bkchr reviewed Sep 17, 2024

View reviewed changes

nazar-pc added a commit to nazar-pc/polkadot-sdk that referenced this pull request Sep 20, 2024

Rename build_network to build_network_advanced and restore `build…

0ac0ddf

…_network` API before paritytech#5666

nazar-pc mentioned this pull request Sep 30, 2024

Upgrade Substrate to stable2409 autonomys/subspace#3078

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Syncing strategy refactoring (part 2) #5666

Syncing strategy refactoring (part 2) #5666

nazar-pc commented Sep 10, 2024 •

edited by dmitry-markin

Loading

nazar-pc commented Sep 10, 2024

dmitry-markin commented Sep 12, 2024

lexnv Sep 13, 2024

nazar-pc Sep 13, 2024 •

edited

Loading

lexnv left a comment

nazar-pc commented Sep 17, 2024

bkchr Sep 17, 2024

nazar-pc Sep 17, 2024

nazar-pc Sep 17, 2024

bkchr Sep 17, 2024

nazar-pc Sep 20, 2024

Syncing strategy refactoring (part 2) #5666

Syncing strategy refactoring (part 2) #5666

Conversation

nazar-pc commented Sep 10, 2024 • edited by dmitry-markin Loading

Description

Integration

Review Notes

Checklist

nazar-pc commented Sep 10, 2024

dmitry-markin commented Sep 12, 2024

lexnv Sep 13, 2024

Choose a reason for hiding this comment

nazar-pc Sep 13, 2024 • edited Loading

Choose a reason for hiding this comment

lexnv left a comment

Choose a reason for hiding this comment

nazar-pc commented Sep 17, 2024

bkchr Sep 17, 2024

Choose a reason for hiding this comment

nazar-pc Sep 17, 2024

Choose a reason for hiding this comment

nazar-pc Sep 17, 2024

Choose a reason for hiding this comment

bkchr Sep 17, 2024

Choose a reason for hiding this comment

nazar-pc Sep 20, 2024

Choose a reason for hiding this comment

nazar-pc commented Sep 10, 2024 •

edited by dmitry-markin

Loading

nazar-pc Sep 13, 2024 •

edited

Loading