feat: market: Register a custom shard indexer for the dagstore #10674

Draft: wants to merge 1 commit into base: master

willscott
Contributor

@willscott willscott commented Apr 16, 2023

This change will result in lotus parsing deals for local indexing according to filecoin-project/FIPs#512

Related Issues

This is a continuation of filecoin-project/dagstore#154. This piece of the logic was seen as Filecoin-specific rather than part of common dagstore functionality. Parallel logic is expected to be ported to boost as well, but there remain configurations of the dagstore in which this is the common location where this indexing functionality can be registered.

Proposed Changes

  • Register a ShardIndexer for the dagstore - such that when a shard is indexed it is considered both for indexing with a data segment index and the existing direct-car indexing.
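
As a rough illustration of the proposed change, the sketch below shows one way an indexer could consider a piece for data segment indexing first and fall back to the existing direct-CAR indexing. This is a minimal sketch, not the code in this PR: `Indexer`, `Entry`, `ErrNoSegmentIndex`, `WithFallback` and both toy indexers are hypothetical names invented for the example, and the `SEG` prefix check only stands in for real data segment index parsing. The actual dagstore `ShardIndexer` signature is not reproduced here.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// ErrNoSegmentIndex is a hypothetical sentinel: the piece carries no data
// segment index and should be treated as a plain CAR file.
var ErrNoSegmentIndex = errors.New("no data segment index")

// Entry is a simplified index record: a block key and its byte offset.
type Entry struct {
	Key    string
	Offset uint64
}

// Indexer is a hypothetical stand-in for an indexing strategy. It is NOT the
// dagstore ShardIndexer type.
type Indexer func(piece []byte) ([]Entry, error)

// WithFallback tries primary first and uses secondary only when primary
// reports ErrNoSegmentIndex. Any other error is returned unchanged.
func WithFallback(primary, secondary Indexer) Indexer {
	return func(piece []byte) ([]Entry, error) {
		entries, err := primary(piece)
		if err == nil {
			return entries, nil
		}
		if errors.Is(err, ErrNoSegmentIndex) {
			return secondary(piece)
		}
		return nil, err
	}
}

// segmentIndexer is a toy: a piece "has" a segment index if it starts with "SEG".
func segmentIndexer(piece []byte) ([]Entry, error) {
	if !bytes.HasPrefix(piece, []byte("SEG")) {
		return nil, ErrNoSegmentIndex
	}
	return []Entry{{Key: "segment-0", Offset: 3}}, nil
}

// carIndexer is a toy stand-in for the existing direct-CAR indexing path.
func carIndexer(piece []byte) ([]Entry, error) {
	if len(piece) == 0 {
		return nil, errors.New("empty piece")
	}
	return []Entry{{Key: "car-root", Offset: 0}}, nil
}

func main() {
	index := WithFallback(segmentIndexer, carIndexer)
	for _, piece := range [][]byte{[]byte("SEGdata"), []byte("plain-car")} {
		entries, err := index(piece)
		fmt.Println(entries, err)
	}
}
```

Composing the two strategies in a wrapper keeps the existing CAR path untouched, which is presumably why a single registration point is enough; whether the real change falls back in exactly this order is an assumption.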

Additional Info

cc @Kubuxu @dirkmc

Checklist

Before you mark the PR ready for review, please make sure that:

  • Commits have a clear commit message.
  • PR title is in the form of <PR type>: <area>: <change being made>
    • example: fix: mempool: Introduce a cache for valid signatures
    • PR type: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
    • area, e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps
  • New features have usage guidelines and / or documentation updates in
  • Tests exist for new functionality or change in behavior
  • CI is green

@willscott willscott force-pushed the feat/data-segment-index branch 2 times, most recently from 678e4ea to fffb877 Compare April 16, 2023 09:27
…gment indices when present.

This change will result in lotus parsing deals for local indexing according to filecoin-project/FIPs#512

This is a continuation of filecoin-project/dagstore#154. This piece of the logic was seen as Filecoin-specific rather than part of common dagstore functionality.
Parallel logic is expected to be ported to boost as well, but there remain configurations of the dagstore in which this is the common location where this indexing
functionality can be registered.
willscott added a commit to filecoin-project/boost that referenced this pull request Jun 6, 2023
this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674
willscott added a commit to filecoin-project/boost that referenced this pull request Jun 7, 2023
this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674
@jennijuju
Member

I understand this is in draft and assume it won’t be opened (given lotus stopped supporting market/deal marketing functionalities since Jan), but I am curious why we are prototyping this in lotus instead of Boost?

@willscott
Contributor Author

This is the current location where the dagstore / metadata occurs. If we're supporting legacy configurations, this is the 'compatible' location for such code to live.

Currently we're only targeting newer boost installations, so the active branch is filecoin-project/boost#1495

willscott added a commit to filecoin-project/boost that referenced this pull request Jun 27, 2023
this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674
willscott added a commit to filecoin-project/boost that referenced this pull request Jun 28, 2023
this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674
willscott added a commit to filecoin-project/boost that referenced this pull request Jul 5, 2023
this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674
LexLuthr added a commit to filecoin-project/boost that referenced this pull request Oct 9, 2023
* feat: add local index directory

Internally this is still referred to as the piece directory

Co-authored-by: dirkmc <[email protected]>
Co-authored-by: Anton Evangelatov <[email protected]>

* refactor: merge from main into lid branch (#1339)

* merge: main to lid (#1370)

* add free check (#1315)

* chore: bump version to 1.6.1 (#1317)

* fix legacy deal verified status (#1324)

* fix: update go-unixfsnode enough to make sure unixfs-preload is available (#1323)

* release v1.6.2-rc1 (#1328)

* use full path (#1330)

* fix bug (#1332)

* use forks of graphsync, go-data-transfer and go-fil-markets (#1333)

* refactor: use forks of graphsync, go-data-transfer and go-fil-markets

* refactor: convert from data transfer v1 to v2 voucher type

* fix: index provider validation voucher type

* fix: pass index provider engine link system through to graphsync's transport configurer

* feat: use tagged version of boost-gfm

* fix: retrieval client imports

* feat: tagged version of lotus

* feat: require go 1.19

* lint: fix lint errors

* fix: itests

* fix: cbor-gen, docsgen

* fix: update CI lint version

* fix: lint

* fix: docgen

* fix: go mod tidy

* fix: protocol proxy TestOutboundForwarding

* fix: docsgen

* fix: update filecoin-ffi submodule

* fix: prometheus duplicate register panic

* fix: cleanup imports

* fix: legs voucher processing

* chore: release v1.6.2-rc2 (#1340)

* release v1.6.2-rc2

* fix test

* fix: flaky TestLibp2pCarServerNewTransferCancelsPreviousTransfer (#1350)

* fix: flaky TestDealCompletionOnProcessResumption (#1351)

* fix: occasional panic on shutdown (#1353)

* feat: query UI (#1352)

* log insert

* fix display error

* refactor code

* shorten status strings

* remove comment

* apply suggestion

* feat: add download block link to inspect page (#1312)

* fix(devnet): update golang and lotus default versions (#1354)

* fix(devnet): bump golang to 1.19

* chore(devnet): bump lotus default version

* chore(devnet): remove unused stable env

* booster-http: implement IPFS HTTP gateway (#1225)

* feat: implement http api gateway

* feat: use go-libipfs lib (instead of copying to extern)

* feat: bump booster-bitswap info minor version

* feat: http gateway metrics

* fix: TestHttpInfo

* feat: by default only serve blocks and CARs, with option to serve original files (jpg, mov etc)

* fix: correct link for download root block (#1355)

* feat: option to cleanup data for offline deals after add piece (#1341)

* chore: add support for multiple node.js versions in makefile (#1356)

* chore: release v.1.7.0-rc1 (#1357)

* release v.1.7.0-rc1

* fix version

* fix: dagstore initialize-all parameter (#1363)

* fix: show verifying commp state for offline deals (#1364)

* fix: boost run missing staging-area dir (#1368)

* merge(wip): main to lid

TODO: remoteblockstore needs to handle nil metrics

* fix: flaky TestNewHttpServer (#1372)

* feat: group agent version by binary name (#1369)

* fix: wrap stats in nil checks for now

we should probably revisit how stats are handled now that we have all 3 transports being tracked

* test(fix): incorrect test urls

---------

Co-authored-by: LexLuthr <[email protected]>
Co-authored-by: Rod Vagg <[email protected]>
Co-authored-by: dirkmc <[email protected]>

* fix: make devnet work for lid (#1375)

* feat: support full addr config in boostd-data

* chore: fix linting for boostd-data

* feat: use addr instead of port for lid

chore: update devnet to work with lid setup

* chore: resolve feedback on lint changes

* feat: fail deal if start epoch passed (#1319)

* fail deal if start epoch passed

* add suggestion

* test: add deal expiry on startup test

---------

Co-authored-by: Dirk McCormick <[email protected]>

* fix: makefile

* fix: db migration ordering

* fix: correct rootcid formatting

* fix: prevent accidental removal of valid sector index announcements

fix: add cache tests and dont announce cache state
fix: add unique index to sector state db
fix: sealed and unsealed sector state conflict
fix: ensure index provider wrapper starts after db migration has completed

* chore: go mod tidy

* fix: download block (#1440)

* LID yugabyte db impl (#1391)

* feat: yugabyte db impl

* feat: run yugabyte tests against a dockerized yugabyte

* fix: use our own yugabyte docker image

* fix: use yugabyte 2.17.2.0 docker image

* feat: piece doctor yugabyte impl

* fix: go mod tidy

* refactor: remove SetCarSize as it's no longer being used

* refactor: remove functionality to mark index as errored (not being used)

* feat: implement delete commands

* refactor: consolidate test params

* feat: add lid yugabyte config

* fix: port map yugabyte postgres to standard port

* Fix yugabyte CI (#1433)

* fix: yugabyte tests in CI

* docker-compose.yml ; Dockerfile.test ; connect to `yugabyte` and not localhost

* add tag

* test lid

* make gen

* fixup

* move couchbase settings under build tag

---------

Co-authored-by: Anton Evangelatov <[email protected]>

---------

Co-authored-by: Anton Evangelatov <[email protected]>

* script to migrate from couchbase to yugabyte (#1445)

* feat: script to migrate from couchbase to yugabyte

* fix: reduce batch size for yugabyte inserts

* Change service GetIndex / AddIndex to return channel instead of array (#1444)

* feat: yugabyte db impl

* feat: run yugabyte tests against a dockerized yugabyte

* fix: use our own yugabyte docker image

* fix: use yugabyte 2.17.2.0 docker image

* feat: piece doctor yugabyte impl

* fix: go mod tidy

* refactor: remove SetCarSize as it's no longer being used

* refactor: remove functionality to mark index as errored (not being used)

* feat: implement delete commands

* refactor: consolidate test params

* feat: add lid yugabyte config

* fix: port map yugabyte postgres to standard port

* Fix yugabyte CI (#1433)

* fix: yugabyte tests in CI

* docker-compose.yml ; Dockerfile.test ; connect to `yugabyte` and not localhost

* add tag

* test lid

* make gen

* fixup

* move couchbase settings under build tag

---------

Co-authored-by: Anton Evangelatov <[email protected]>

* wip: service GetIndex returns channel of records instead of array

* feat: return channel from AddIndex and GetIndex

---------

Co-authored-by: Anton Evangelatov <[email protected]>

* local index directory: recover tool (#1410)

* initial disaster recovery tool for LID

* wip

* do not block on individual error

* instantiate lid

* report

* catch signal

* fixup

* comment out sector already in progress

* fixup

* start containers with init: true

* record that we dont have an unsealed copy

* match deals with boost sqlite db and piece store

* fixup

* fixup

* use logger

* fixup

* disable stacktrace

* fixup

* extract piece store away from disaster recovery struct

* add more sanity checks

* compare IsUnsealed vs storage find

* improve safeIsUnseal

* fixup

* better logs

* expand repodir

* calc properly next offset

* fixup

* add sector id to logs

* incr offset

* break after finding expired deal

* more logs

* fewer logs

* better logs

* better error

* refactor

* refactor minerApi

* better logs

* add time around add index

* pd.Start

* LID benchmarking tool (#1276)

* feat: LID benchmarking tool

* fix: bench thread safety

* refactor: structured logging

* refactor: postgres bulk insert

* lid bench: Add foundationdb impl

* lid fdb: Fix Tx sizing, parallel chunk puts

* lid fdb: More efficient sample generation

* feat: array of piece count / blocks per piece (#1314)

* lid bench: print add rate

* lid bench: Add retry to postgres put (#1316)

* lid bench: Make cassandra put much more robust (#1318)

* instrumentation for bench tool (#1337)

* instrument postgres

* more instrumentation

* check for err getoffsetsize

* emit metrics every 10sec

* ignore errors

* add postgres-drop

* use directly tables

* fix: go mod tidy

* use INSERT INTO instead of tmp tables

* try to catch sig

* remove transaction commit

* fixup

* add postgres-init

* fixup

* split create and init

* fixup

* remove if not exist

---------

Co-authored-by: Dirk McCormick <[email protected]>

* feat: batch insert queries for postgres

* feat: add flag to insert into postgres using tmp table

* refactor: merge changes from nonsense/lid-bench

* refactor: just use one database (dont create bench database)

* refactor: remove unused params

* refactor: command structure

* fix: cassandra - dont use batch insert for PayloadToPieces

* fix: create tables CQL

* fix: increase payload to pieces insert parallelism

* fix: use simple replication strategy

* feat: use yugabyte cassandra driver

* fix: remove bench binary

* update metrics endpoint

* fix random generated piece cid

* fixup

* fix: cassandra bitswap benchmark

* remove foundationdb

---------

Co-authored-by: Łukasz Magiera <[email protected]>
Co-authored-by: Łukasz Magiera <[email protected]>
Co-authored-by: Anton Evangelatov <[email protected]>

* fix: failing tests due to bad merge

* fix: flaky TestMultipleDealsConcurrent

* more logs

* piece doctor and sector state manager refactor (#1463)

* fix timer.Reset and improve logs

* revert randomization

* piece doc: handle errors

* adjust piece check

* refactor unsealsectormanager

* refactor piece doctor

* add random ports

* ignore tests

* add version to boostd-data

* fix ctx in Start

* fix: add reader mock to fix tests

* fix: pass new piece directory to provider on test restart

* fix synchronisation

* note that panics are not propagated in tests

* carv1 panics piece directory

* print panics

* fix: use reader that supports Seek in piece reader mock

* fix: reset mock car reader on each invocation

* fix: TestOfflineDealDataCleanup

* add check for nil cancel func

* bump min check period for LevelDB to 5 minutes

* check if sector state mgr is initialised

* debug line for unflagging

* commenting out TestMultipleDealsConcurrent -- flaky test -- works locally

* add SectorStateUpdates pubsub

* add close for pubsub

* add mock sectorstatemgr

* add wrapper tests

* fixup

* cleanup

* cleanup

* better names

* t.Skip for test

* remove TODO above println for panic

* add unit tests for refreshState

* rename tests

* more cases

* more tests

* update description

* better comment

* better names and comments

---------

Co-authored-by: Dirk McCormick <[email protected]>

* Merge from main to lid branch (#1483)

* fix statx output string (#1451)

* fix: flaky TestMultipleDealsConcurrent (#1458)

* Add option to serve index provider ads over http (#1452)

* feat: option to serve index provider ads over http

* fix: config naming, hostname parsing

* fix: update docsgen

* fix: log announce address

* feat: add config for indexer direct announce urls

* refactor: always announce over pubsub

* fix: docsgen

* test: add test case for empty announce address hostname

* Add `boostd index announce-latest` command (#1456)

* feat: boostd index announce-latest

* feat: add announce-latest-http command

* fix: default direct announce url

* feat: update to index-provider v0.11.2

* Signal to index provider to skip announcements (#1457)

* fix: signal to index provider to skip announcements

* fix: ensure multihash lister skip error is of type ipld.ErrNotExists

---------

Co-authored-by: LexLuthr <[email protected]>

* release v1.7.3-rc2 (#1460)

* fix: improve stalled retrieval cancellation (#1449)

* refactor stalled retrieval cancel

* add ctx with timeout

* implement suggestions

* update err wrapping

* fix: set short cancel timeout for unpaid retrievals only

---------

Co-authored-by: Dirk McCormick <[email protected]>

* feat: enable listen address for booster-http (#1461)

* enable listen address

* modify tests

* fix nil ptr (#1470)

* fix: incorrect check when import offline deal data using proposal CID (#1473)

* fix incorrect early check

* update error msg

* fix(server): properly cancel graphsync requests (#1475)

* set UI default listen address to localhost (#1476)

* feat: display msg params in the mpool UI (#1471)

* show msg params

* fix: mpool nil pointer

* fix width

---------

Co-authored-by: Dirk McCormick <[email protected]>

* Reset read deadline after reading deal proposal message (#1479)

* fix: reset read deadline after reading deal proposal message

* fix: increase client request deadline

* feat: Show elapsed epoch and PSD wait epochs in UI (#1480)

* show epochs

* fix devnet UI, use BlockdDelaySecs

* fix lint err

* Update gql/resolver.go

Co-authored-by: dirkmc <[email protected]>

---------

Co-authored-by: dirkmc <[email protected]>

* release v1.7.3-rc3 (#1481)

---------

Co-authored-by: LexLuthr <[email protected]>
Co-authored-by: LexLuthr <[email protected]>
Co-authored-by: Hannah Howard <[email protected]>

* update local index directory ui (#1477)

* feat: update local index directory ui

* comment out wrench as docker doesnt build

* rearrange menu

* refactor: remove sectors list

---------

Co-authored-by: Anton Evangelatov <[email protected]>

* feat: surface indexing errors (#1490)

* feat: log panic (instead of just printing to stdout) (#1491)

* split flagged pieces into unsealed/sealed tables (#1493)

* refactor: remove couchbase tests (#1496)

* refactor: remove piece directory couchbase tests (#1497)

* GraphQL resolvers for LID (#1494)

* wip

* rename

* sectorUnsealedCopies and SectorProvingState

* fix: piece directory tests (#1498)

* log line for only sealed sectors

* more logs

* feat: flagged pieces (#1501)

* check that sector has deals for unsealed sectors (#1502)

* check that sector has deals for unsealed sectors

* simplify

* rename heading

* piece doctor to ignore expired/slashed deals (#1503)

* ignore expired/slashed deals

* fix mocks

* add timer for checkPiece

* move ChainHead away from checkPiece

* add nil check for fullnodeApi

* add debug line

* fix pagination

* LID landing page: add stats around Flagged and non-Flagged pieces (#1508)

* wip

* fixup

* add debug line

* fixup

* feat: split flagged pieces page into flagged / flagged because unsealed (#1509)

* fix: display of no flagged pieces (#1511)

* disable dummy panels - block stats; deal data (#1510)

* fix unsealed field in flagged piece (#1515)

* update ffi

* fix main merge issue

* fix go mod

* Add info boxes on LID UI page (#1516)

* feat: add info boxes on LID UI page

* Update react/src/LID.js

Co-authored-by: Anton Evangelatov <[email protected]>

* Update react/src/LID.js

Co-authored-by: Anton Evangelatov <[email protected]>

---------

Co-authored-by: Anton Evangelatov <[email protected]>

* feat: replace migrate couchbase command with migrate yugabyte (#1518)

* remove redundant makefile (#1519)

* remove redundant makefile

* add migrate-lid to Makefile

* update gitignore

* move booster-bitswap and booster-http to make and make install

* fix: inspect page - dont try to fetch root cid (#1525)

* feat: add send epoch, time, elapsed epoch and elapsed time for each message in mpool to UI (#1523)

* add message epoch/time details

* implement suggestion

* use moment lib

* fix alerting bug

* update polling interval

* add logs

* fix devnet: use ws instead of http to connect to boostd-data

* feat: make legacy deals optional (#1524)

* make legacy deals optional

* fix gen

* modify itests, create new

* handle legacy stream explicitly

* separate out the protocols

* fix lint error

* enable itest in CI

* fix ci

* apply suggestions

* fix error after conflict resolution

* refactor: simplify legacy deal response code

---------

Co-authored-by: Dirk McCormick <[email protected]>

* refactor: remove couchbase implementation (#1535)

* Update lotus and boxo versions (#1466) (#1537)

* Update to use packages in go-libipni

* feat: update lotus version

* update boxo (#1492)

* feat: update boxo

* refactor: depend on repo:Jorropo/lotus branch:boxo2

* chore: temporarily update go-fil-markets with replace directive

* feat: switch itests framework ExtractFileFromCAR to use non-global IPLD registry

* feat: switch booster-bitswap client fetch to use the go-ipld-prime globals via go-ipld-legacy

* go fmt

* chore: update dependencies and migrate to boxo

* fix: update boost-gfm

* fix: stop itests framework from prematurely setting listenaddrs via go-libp2p defaults that conflict with lotus

* fix: docs gen

* chore(deps): update deps for boxo v0.10.0

* chore(deps): update boost-gfm

* fix(booster-http): update for boxo v0.10.0

* chore(deps): update to remove kubo dependency

* fix(gen): update docs gen

* feat: update boost-gfm to v1.26.6

* chore(deps): update lotus to master

---------




---------

Co-authored-by: gammazero <[email protected]>
Co-authored-by: Adin Schmahmann <[email protected]>
Co-authored-by: hannahhoward <[email protected]>

* feat: update boost-gfm to v1.26.7 (#1538)

* fix: piece doctor tests (#1540)

* refactor: build indexes for legacy deals (#1539)

* feat: http index announcements (#1418)

* feat(indexprovider): announce http transport

refactor: isolate extended provider logic

feat: announce http indexes

refactor(indexprovider): use metadata.Default

fix(wrapper): fix compile error

* fix http ep signing bug

* update comment

---------

Co-authored-by: LexLuthr <[email protected]>

* feat: check unseal status of piece through both apis (#1548)

* fix: metrics and Grafana (#1546)

* fix grafana, metrics

* remove dagstore from name

* fix: add missing PieceDeal (PieceCid) index (#1551)

* fix: iterate all deals to index piece (#1549)

* fix: iterate all deals to index piece

* add test, use multierror

* add and update comments

* refactor: separate yugabyte / leveldb tests for easier local testing (#1553)

* Parse indexed deals for data segment index

this makes boost aware of the format in
https://github.com/filecoin-project/go-data-segment/
for new deals.

this is the same logic as in filecoin-project/lotus#10674

* first pass at segment fixture

* fix compile issues in test

* test passes

* additional seek on fallback

* code review

* remaining test fix

* refactor: simplify data segment index test

* update to tagged go-data-segment

* feat: refactor mpool page in UI (#1530)

* modify GQL

* fix count type

* fix locks

* fix js

* migrate config to v5 (#1560)

* migrate config to v5

* change default version

* chore: release v2.0.0-rc1 (#1561)

* Upgrade to index-provider v0.13.4 (#1559)

Upgrade to the latest index-provider library.

* itest for data segment

* enable test in circleCI

* feat: add IPNI itest (#1563)

* ipni itest

* refactor test

* add to circleCI

* add indexer topic

* Print protocol IDs exposed by f.Boost

* generate topic name dynamically

---------

Co-authored-by: Masih H. Derkani <[email protected]>

* fix file comparison

* Add test for second segment

* IPNI UX (#1562)

* feat: IPNI UX

* Update react/src/Ipni.js

Co-authored-by: LexLuthr <[email protected]>

* feat: server side config

---------

Co-authored-by: LexLuthr <[email protected]>

* fix: data segment offset calculation

* fix itest to use carv1

* change itest to use 1.5 mb file

* use fixtures

* use fixture.dat generated by Will

* Remove carv1/carv2 specific semantics from commP calculations

The commp should be directly over the bytes of the deal, it shouldn't care about car format.

* fail on empty files

* fix another instance where deals are assumed to be car files

* And fix a test assertion per expectation

* Update recorded offsets when indexing a carv2 file

* update to play nicely between absolute index and car blockstore

* better logging on read errors

* additional logging

* check if offset not properly accounted

* try other direction

* remove debugging

* replay on get errors

* lint

* see if this identifies the unexpected behavior in skipNext

* offbyone

* print buffer of object

* get slice around target area

* fix: use fixed car BlockReader#SkipNext SourceOffset (#1659)

* fix: use fixed car BlockReader#SkipNext SourceOffset

Ref: ipld/go-car#491

* chore: regenerate cbor-gen types w/ cbor-gen update

* fix compilation errors

* feat: add corshandler to IPFS gateway (#1589)

* add corshandler to IPFS gateway

* use cors lib

* go mod tidy

* refactor test, remove extra cors

* fix a couple files that failed on the merge to master

* fix: CARv2 deal and retrieval support fixes

* fix: make test work

* feat: show both sealing and index / announce status, if there's an error announcing (#1699)

* require data segment index to have a valid car at each segment

* force only car deals in commp

* use expected ffi

---------

Co-authored-by: Jacob Heun <[email protected]>
Co-authored-by: dirkmc <[email protected]>
Co-authored-by: Anton Evangelatov <[email protected]>
Co-authored-by: Jacob Heun <[email protected]>
Co-authored-by: LexLuthr <[email protected]>
Co-authored-by: Rod Vagg <[email protected]>
Co-authored-by: Łukasz Magiera <[email protected]>
Co-authored-by: Łukasz Magiera <[email protected]>
Co-authored-by: LexLuthr <[email protected]>
Co-authored-by: Hannah Howard <[email protected]>
Co-authored-by: gammazero <[email protected]>
Co-authored-by: Adin Schmahmann <[email protected]>
Co-authored-by: Masih H. Derkani <[email protected]>
2 participants