Problem statement

The dependencies between the steps of a pipeline are currently defined inside the `core` package, in crawl_path.mjs.

As a result, contributors who want to write a strategy have to modify both the `strategies` package and the `core` package. As I understand the architecture, `core` should only provide the crawling boilerplate and orchestration, and should be unaware of the specifics of the implemented strategies.

Mitigation proposal
Move crawl_path.mjs into the `strategies` package.

Technical credit
As more pipelines are integrated, the complexity of their dependencies will grow. There is already a non-trivial graph defining how the `soundxyz` and `zora` pipelines work: a common parent step crawls `web3subgraph`, the pipeline then branches out to crawl the two platforms, and the branches merge again in `musicosaccumulator`.

Further down the road, other platforms may need to be integrated into this same pipeline, along with additional intermediate transformation steps.
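To make the shape of that graph concrete, here is a minimal sketch of how it could be declared as plain data. The step names come from the description above; the object layout (each step mapped to the list of steps it depends on) and the `validateGraph` helper are assumptions made for illustration, not the current format of crawl_path.mjs.

```javascript
// Hypothetical declaration: each key is a step, each value lists the steps
// it depends on. This layout is an assumption, not the existing format.
const musicPipeline = {
  web3subgraph: [],
  soundxyz: ["web3subgraph"],
  zora: ["web3subgraph"],
  musicosaccumulator: ["soundxyz", "zora"],
};

// Hypothetical helper: report every dependency that names a step
// missing from the graph, so mistakes surface before a crawl starts.
function validateGraph(graph) {
  const errors = [];
  for (const [step, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      if (!Object.hasOwn(graph, dep)) {
        errors.push(`${step} depends on unknown step ${dep}`);
      }
    }
  }
  return errors;
}

console.log(validateGraph(musicPipeline));
```

With a declaration like this, adding a platform would mean adding one entry and listing it as a dependency of the accumulator step, without touching `core`.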
Having a clear way to define the dependency graph in the `strategy` repository would help future contributors understand how their new strategy should be integrated, either by adding it to an existing graph or by creating a completely independent one (such as the `get-xkcd` crawler).
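As a rough illustration of how `core` could stay unaware of individual strategies, the sketch below turns a dependency graph into ordered stages using a level-by-level topological sort. The `toStages` function, the graph layout, and the idea that `core` would receive such graphs from the strategies package are all assumptions for discussion, not existing APIs.

```javascript
// Hypothetical: group steps into stages so that every step runs only after
// all of its dependencies. Steps within one stage are independent of each other.
function toStages(graph) {
  const remaining = new Map(
    Object.entries(graph).map(([step, deps]) => [step, new Set(deps)])
  );
  const stages = [];
  while (remaining.size > 0) {
    // A step is ready once all of its dependencies have been scheduled.
    const ready = [...remaining.keys()]
      .filter((step) => remaining.get(step).size === 0)
      .sort();
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in graph");
    }
    for (const step of ready) remaining.delete(step);
    for (const deps of remaining.values()) {
      for (const step of ready) deps.delete(step);
    }
    stages.push(ready);
  }
  return stages;
}

// Two independent graphs, as they might be declared next to the strategies.
const graphs = {
  music: {
    web3subgraph: [],
    soundxyz: ["web3subgraph"],
    zora: ["web3subgraph"],
    musicosaccumulator: ["soundxyz", "zora"],
  },
  xkcd: { "get-xkcd": [] },
};

for (const [name, graph] of Object.entries(graphs)) {
  console.log(name, JSON.stringify(toStages(graph)));
}
```

Under these assumptions, a new strategy either adds an entry to an existing graph or declares a new graph of its own, and the orchestration code does not change in either case.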