Bacalhau project report 20220902

lukemarsden edited this page Sep 2, 2022 · 4 revisions

Finished Master Plan - Part 1!

Filecoin (Lotus) and Estuary publishers

We now have publishers for Filecoin and Estuary. This means that a properly configured Bacalhau server can publish jobs (when requested) to these backends for long-term storage/backup of results.

This means a properly configured Service Provider will be able to read from unsealed copies of Filecoin data using the filecoin unsealed storage driver (read side) and persist the result back to Filecoin using the Filecoin Lotus publisher (write side)!

Filecoin+ APIs and Dashboard

We have developed a Filecoin+ API and Dashboard per the previously described Filecoin+ plan.

Here it is!

[Screenshot: the Filecoin+ dashboard showing Bacalhau jobs]

This dashboard shows jobs that are running on the Bacalhau network, and flags those which use Filecoin+ data as input, so that they can be considered for DataCap when their outputs are stored via Filecoin+ storage deals. This will allow Service Providers to differentiate themselves to customers by offering compute over the data stored on them, while publishing results back into Filecoin+ gives the SPs themselves an incentive to run more jobs. More jobs mean more output and more verified Filecoin+ storage, which ultimately benefits humanity by storing valuable derivatives of important datasets -- securely, globally.

The dashboard (which shows Bacalhau jobs, and Filecoin+ eligible jobs) is powered by a new API which returns:

  • A stream of jobs running on the Bacalhau network
  • Flags on jobs which have input data from Filecoin+
  • The output of those jobs
  • The JobSpecs of those jobs, so that e.g. the source code transforming the data can be inspected by a notary

We continue engaging with Service Providers to integrate all of the above pieces for end-to-end unsealed reads, Filecoin persistence of results, and Filecoin+ as an initial incentive layer to run Bacalhau.

Security considerations

As well as the spreading algorithm (new --min-bids parameter) described in the previous report, which has now landed in the main branch, we have identified and resolved two additional attack vectors:

  1. a malicious client spamming the public API of the nodes, which we solved with rate limiting at the REST level
  2. a malicious requestor node (or compute node) spamming the libp2p transport

We addressed #1 by rate limiting the REST server by IP address. This won't make it impossible to DoS the network, but it will make it harder since an attacker would have to control many IP addresses rather than being able to mount an attack just from a single machine (i.e. they'd have to do a DDoS not just a DoS).
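To illustrate the approach (this is a simplified sketch using only the standard library, not the actual Bacalhau code), per-IP rate limiting can be implemented as HTTP middleware that keeps a token bucket per client address:

```go
package main

import (
	"net"
	"net/http"
	"sync"
	"time"
)

// ipLimiter is a sketch of per-IP rate limiting: each client IP gets a
// token bucket refilled at `rate` tokens per second, up to `burst` tokens.
type ipLimiter struct {
	mu      sync.Mutex
	rate    float64
	burst   float64
	buckets map[string]*bucket
}

type bucket struct {
	tokens float64
	last   time.Time
}

func newIPLimiter(rate, burst float64) *ipLimiter {
	return &ipLimiter{rate: rate, burst: burst, buckets: map[string]*bucket{}}
}

// allow reports whether the given IP may make a request right now.
func (l *ipLimiter) allow(ip string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	now := time.Now()
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[ip] = b
	}
	// Refill tokens for the time elapsed since this IP's last request.
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// middleware wraps an http.Handler and rejects over-limit clients with 429.
func (l *ipLimiter) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr // fall back to the raw address
		}
		if !l.allow(ip) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {}
```

Because the bucket is keyed by IP, a single machine exhausts its own budget quickly, while spreading the same attack across many addresses (a DDoS) is required to have network-wide effect.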

We addressed #2 by adopting the same PeerGaterParams that Lotus uses for its libp2p pubsub implementation. Per the go-libp2p-pubsub docs:

WithPeerGater is a gossipsub router option that enables reactive validation queue management. The Gater is activated if the ratio of throttled/validated messages exceeds the specified threshold. Once active, the Gater probabilistically throttles peers before they enter the validation queue, performing Random Early Drop. The throttle decision is randomized, with the probability of allowing messages to enter the validation queue controlled by the statistical observations of the performance of all peers in the IP address of the gated peer. The Gater deactivates if there is no validation throttling occurring for the specified quiet interval.
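To illustrate the Random Early Drop idea described above (a simplified sketch, not go-libp2p-pubsub's actual implementation), a gater can track validated vs. throttled message counts per peer and admit new messages with a probability derived from those statistics:

```go
package main

import (
	"fmt"
	"math/rand"
)

// peerStats is a sketch of the statistics a gater might keep per peer
// (or per peer IP): how many messages were validated vs. throttled.
type peerStats struct {
	validated float64
	throttled float64
}

// acceptProbability returns the chance that a new message from this peer
// is allowed into the validation queue. Peers whose messages are rarely
// throttled get probability ~1; heavily throttled peers are mostly dropped.
func (s peerStats) acceptProbability() float64 {
	total := s.validated + s.throttled
	if total == 0 {
		return 1.0 // no history yet: admit
	}
	return s.validated / total
}

// gate performs the Random Early Drop decision once the gater is active.
func gate(s peerStats, rng *rand.Rand) bool {
	return rng.Float64() < s.acceptProbability()
}

func main() {
	rng := rand.New(rand.NewSource(1))
	wellBehaved := peerStats{validated: 100, throttled: 0}
	misbehaving := peerStats{validated: 10, throttled: 90}
	fmt.Println(gate(wellBehaved, rng), misbehaving.acceptProbability())
}
```

The real gater adds decay of these counters over time and a quiet-interval deactivation, but the core mechanism is the same: a randomized drop before validation, weighted by observed peer behaviour.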

There are many more security issues to address (such as clients consuming all of the available RAM and CPU on the network by submitting a relatively small number of large jobs), which we will tackle in the next phase of the project!

Onwards to Part 2! 🚀

New starters

In order to drive the exciting work we have planned for part 2 of the master plan, we have recruited and onboarded two amazing new engineers:

  • Prasanth (Prash) Pagolu - a new core engineer with deep experience of software engineering, DevOps & architecture
  • Simon Worthington - a cryptography specialist with deep experience of distributed systems handling data for government

We are excited to welcome Prash and Simon to the team! 🎉

Threads of work

We have planned an allocation of work across the (now 7+ person) engineering team in various threads:

  • Verification protocol & marketplace
  • Smart Contract → Consensus
  • User onboarding & examples
  • UX improvements, bugfixes, docs etc
  • Resilience, APIv1, persistence
  • Getting SPs on board; and Filecoin, Filecoin+ integrations with real SPs
  • WASM - support for arbitrary wasm bytecode that can load a stream of an IPFS/Filecoin CID, and collaborating with the IPVM project

We are kicking off each of these threads in the coming weeks, and we'll talk about them in much more detail in project reports to come 😄
