Bacalhau

The Distributed Computation Framework ⚡
Compute Over Data (CoD)



Bacalhau is a platform for fast, cost-efficient, and secure computation that runs jobs where the data is generated and stored. With Bacalhau, you can streamline your existing workflows without extensive rewriting by running arbitrary Docker containers, WebAssembly (wasm) images, or arbitrary binaries as tasks.


Why Bacalhau?

  • Fast job processing: Jobs in Bacalhau are processed where the data was created, and all jobs are parallel by default.
  • 💰 Low cost: Reduce (or eliminate) ingress/egress costs since jobs are processed closer to the source. You can also take advantage of idle compute capacity at the edge.
  • 🔒 Secure: Data scrubbing and security checks can happen before migration, which reduces the chance of leaking private information, and access is governed by a far more granular, code-based permission model.
  • 🚛 Large-scale data: Bacalhau operates on a network of open compute resources made available to serve any data processing workload. With Bacalhau, you can batch process petabytes (quadrillions of bytes) of data.

Getting started - Bacalhau in 1 minute

Go to the directory where you want to store your job results

Install the bacalhau client

curl -sL https://get.bacalhau.org/install.sh | bash

Submit a "Hello World" job

bacalhau docker run ubuntu echo Hello World

Download your result

bacalhau get 63d08ff0..... # make sure to use the right job id from the docker run command

For a more detailed tutorial, check out our Getting Started tutorial.
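
The download step above can also be scripted. The following is a minimal Python sketch, not part of the Bacalhau tooling, that wraps the `bacalhau get` command shown earlier. The helper names `build_get_command` and `download_results` are hypothetical, and the sketch assumes that the `bacalhau` client is on your `PATH` and that `bacalhau get` writes its output into the current working directory, as the first step of this section suggests.

```python
import subprocess
from pathlib import Path


def build_get_command(job_id):
    """Build the argument list for `bacalhau get <job id>`.

    Hypothetical helper: it only assembles the command from the README.
    """
    job_id = job_id.strip()
    if not job_id:
        raise ValueError("job_id must not be empty")
    return ["bacalhau", "get", job_id]


def download_results(job_id, results_dir="."):
    """Run `bacalhau get` from inside results_dir and return that directory.

    Assumption: the client stores results relative to the working directory.
    """
    target = Path(results_dir)
    target.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the client exits with an error.
    subprocess.run(build_get_command(job_id), cwd=target, check=True)
    return target
```

Pass the job id printed by the `docker run` command, for example `download_results(job_id, "results")`. Keeping the command construction in its own function makes it easy to check without contacting the network.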

Learn more

Documentation

📚 Read the Bacalhau docs guide here! 📚

The Bacalhau docs are the best starting point: they contain the information you need to use Bacalhau efficiently.

Developers guide

Running Bacalhau locally

Developers can spin up bacalhau and run a local demo using the devstack command.

Please see running_locally.md for instructions. Also, see debugging_locally.md for some useful tricks for debugging.

Notes for Dev contributors

Bacalhau's CI pipeline performs a variety of linting and formatting checks on new pull requests. To have these checks run locally when you make a new commit, you can use the precommit hook in ./githooks:

make install-pre-commit

# check if pre-commit works
make precommit

If you want to run the linter manually:

curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sudo sh -s -- -b /usr/local/go/bin
golangci-lint --version
make lint

The linter configuration lives in .golangci.yml.

OpenAPI

OpenAPI v2 annotations sit next to the endpoints in pkg/publicapi. They are processed by swag, a Go tool that converts annotations into Swagger documentation; see the swag docs for details about the annotation format. The swagger specification is built automatically by the CI pipeline (see the build_swagger workflow), but you can trigger a local build with make swagger-docs.

The build parses the OpenAPI annotations as well as the markdown files in docs/swagger/ (containing long-form descriptions of the API endpoints), and generates the following swagger specification files:

  • docs/docs.go
  • docs/swagger.json
  • docs/swagger.yaml
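
Once the specification has been generated, it can be inspected programmatically. The following is a minimal Python sketch, not part of the repository, that lists the operations declared in a Swagger 2.0 document such as docs/swagger.json. The helper names `load_spec` and `list_operations` are hypothetical, and the default path assumes that you run the code from the repository root after `make swagger-docs`.

```python
import json
from pathlib import Path

# Operation keys allowed on a path item in Swagger / OpenAPI v2.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def load_spec(spec_path="docs/swagger.json"):
    """Read a generated swagger JSON file into a dictionary."""
    return json.loads(Path(spec_path).read_text(encoding="utf-8"))


def list_operations(spec):
    """Return (METHOD, path, summary) tuples, ordered by path."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            operations.append((method.upper(), path, operation.get("summary", "")))
    return operations
```

Calling `list_operations(load_spec())` gives a quick overview of the endpoints, which can be handy for checking that a new annotation made it into the generated files.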

Python Libraries

We ship two Python Bacalhau libraries:

  • bacalhau-apiclient wraps only the API endpoint calls and request/response models. It is autogenerated from the OpenAPI specification (see the OpenAPI section above). Read more about it in its readme.
  • bacalhau-sdk is a high-level Bacalhau SDK that ships all the client-side logic (e.g. signing requests) needed to query the endpoints. Its examples folder contains code snippets to create, list and inspect jobs. Under the hood, it uses the bacalhau-apiclient to call the API. Please use this library in your projects. Read more about it in its readme.

Issues, feature requests, and questions

We are excited to hear your feedback!

  • For issues and feature requests, please open a GitHub issue.
  • To ask questions, give feedback, or answer questions that help other users, please use GitHub Discussions.
  • To engage with other members of the community, join the #bacalhau channel in our Slack community 🙋

Ways to contribute

All manner of contributions are more than welcome!

We have highlighted the different ways you can contribute in our contributing guide. You can be part of community discussions, development, and more.

Open Source

This repository contains the Bacalhau software, covered under the Apache-2.0 license, except where noted (any Bacalhau logos or trademarks are not covered under the Apache License and should be explicitly noted by a LICENSE file).

Bacalhau is a product produced from this open source software, exclusively by Expanso, Inc. It is distributed under our commercial terms.

Others are allowed to make their own distribution of the software, but they cannot use any of the Bacalhau trademarks, cloud services, etc.

We explicitly grant permission for you to make a build that includes our trademarks while developing Bacalhau software itself. You may not publish or share the build, and you may not use that build to run Bacalhau software for any other purpose.

We have borrowed the above Open Source clause from the excellent System Initiative.