Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v0.1 #1

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

v0.1 #1

wants to merge 14 commits into from

Conversation

expede
Copy link
Member

@expede expede commented Dec 1, 2022

@expede expede changed the title v0.1 draft v0.1 Dec 1, 2022
@expede expede marked this pull request as draft December 1, 2022 06:45
@expede expede mentioned this pull request Dec 1, 2022
README.md Outdated
Comment on lines 347 to 389
{
"with": "ipfs://bafkreidvq3uqoxcxr44q5qhgdk5zk6jvziipyxguirqa6tkh5z5wtpesva",
"can": "docker/run",
"inputs": {
"entrypoint": ["/", "home"],
"mounts": {"mnt": "/"},
"outputs": {"tmp": "/tmp"},
"env": {"$HELLO": "world"},
"workdir": "/work"
},
"meta": {
"ipvm/config": {
"secret": false,
"check": {
"optimistic": 2,
"referee": "did:key:zStEZpzSMtTt9k2vszgvCwF4fLQQSyA15W5AQ4z3AR6Bx4eFJ5crJFbuGxKmbma4"
}
}
}
}
```

``` json
{
"with": {"ucan/promise": ["/", "previous_action"]},
"can": "docker/run",
"inputs": {
"entrypoint": ["/", "home"],
"mounts": {"mnt": "/"},
"outputs": {"tmp": "/tmp"},
"env": {"$HELLO": "world"},
"workdir": "/work"
},
"meta": {
"ipvm/config": {
"disk": "200GB"
}
}
}
```
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From comments on the old repo, I've included examples of how to both pull a Docker container off the network & run it, as well as how to do this with a pipeline.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fixed s/can/do/ 😅 )

@expede expede marked this pull request as ready for review December 1, 2022 07:16
Copy link

@zeeshanlakhani zeeshanlakhani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First run on comments. Will have more after our discussion(s).

README.md Outdated Show resolved Hide resolved

``` ipldsch
type SystemConfig struct {
secret Boolean (implicit False)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of implicit should these say defaults or defaults to as more typical?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an option in IPLD Schema https://ipld.io/docs/schemas/features/typekinds/#value-type-modifiers

The other note on IPLD Schema is that it's an atypical system that mixes types and representation pretty liberally. I'm still learning the ins-and-outs myself.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha.

| `inputs` | `Any` | | Yes | |
| `meta` | `{String : Any}` | Fields that will be ignored during memoization | No | `{}` |

An OPTIONAL IPVM `Config` MAY be included at the `meta['ipvm/config']` path. The `meta` field SHOULD not captured as part of task memoization, so this information will be omitted from the distributed invocation table. If included, the `Config` MUST set the IPVM configuration for this Task, overwriting any of the fields on the envelope's top-level `defaults` field, or system-wide defaults.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta feels much more optional than config. Shouldn't config and/or something like turning on/off certain features be its own field, as it seems less "throw-away" than meta?

| `secret` | `Boolean or null` | Whether the output is unsafe to publish | No | `null` |
| `check` | `Verification` | Verification strategy | No | `"attestation"` |
| `time` | `TimeInterval` | Timeout | No | `[5, "minutes"]` |
| `memory` | `InfoSize` | Memory limit | No | `[100, "kilo", "bytes"]` |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: on infosizes, would it be better to specify units like kb, ms, as that can be enumerated upon?

Copy link
Member Author

@expede expede Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha everyone hates it when I do this, but I keep trying in every project 😆

can be enumerated upon?

It is enumerated! Just composable, so we don't need kB, MB, TB, PB, ns, us, ms, etc. I guess it's not a huge deal either way. We could also set these fields to string.

If you still dislike it I can pull it out.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. no biggie. I've been used to systems like prom, etc, which have:

Units supported: B, KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on powers-of-2, so 1KB is 1024B

So, are people more used to that, maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right — these kinds of libraries are very common and pretty much what people expect. Ugh, stringly typed things that I have to parse a bunch of different ways (whitespace etc) is gross 😆

100% of the time that I've done this someone has mentioned it, so clearly it's against the principle of least surprise. Will revert!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love it, please keep it! 😅

README.md Outdated
| Field | Type | Description | Required |
|-----------------|------------|-------------------------------------------------------------------------|----------|
| `ipvm/workflow` | `Workflow` | IPVM Workflow | Yes |
| `signature` | `VarSig` | [VarSig](https://github.com/ChainAgnostic/varsig/) of serialized fields | Yes |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the repo is still WIP for VarSig? Can we get a footnote about it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're trying to retcon varsigs a bit by adding more data to them. I hope to get a spec out in the next day or two.

|------------|------------------------------|------------------------------------------------------------------------|----------|---------|
| `v` | `"0.1.0"` | IPVM workflow version | Yes | |
| `meta` | `{String : Any}` | User-defined object (tags, comments, etc) | No | `{}` |
| `parent` | `[&Workflow, Label] or Null` | The workflow & task label that initiated the current workflow (if any) | No | `Null` |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minus general provenance/lineage, @expede how do you see parent being used in an implementation of workflow(s)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entirely for provenance & tracing

README.md Show resolved Hide resolved
| `config` | `Config` | Global configuration (e.g. timeout for the entire workflow) | No | `{}` |
| `defaults` | `Config` | Individual task config defaults | No | `{}` |
| `tasks` | `UCAN.Invocation` | UCAN Invocation | Yes | |
| `catch` | `&WasmTask` | Deterministic Wasm that fires on exceptions | No | `{}` |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is wasmtask a ucan/invocation task, as it defaults to a {}? I ask because elsewhere in the doc, it looks like a CID directly. I wonder if catch points to a CID or wasm::CID and/or to a known top-level key/name of an invocation task that may already have been introduced previously?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here catch is wasm + determinism-specific, so it should not be allowed on the workflow top-level if an inner task is a docker one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it defaults to a {}

Oops! That should definitely default to Null. Will fix!

a known top-level key/name of an invocation task

I want to keep it in a separate block for a couple reasons:

  • Pure Wasm only
  • Set really tight, enforced resource restrictions (it should terminate quickly)
  • If it's a named task, it will get scheduled with the rest of the workflow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to tighten up some of these, too. That really should be &Wasm.


Note that while IPVM MUST treat the pure tasks together as transactional. It is not possible to roll back any destructive effects that have already been run. As such, it is RECOMMENDED to have few (if any) tasks depend on the output of a destructive effect, so they can be scheduled at the end of the workflow.

# 6 Receipt Output

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'd love to see an example of the receipt output in json and maybe linked through with a workflow example. It'd also be great to link the result CID w/ label to another, new workflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! Per our conversation after you wrote this comment, we'll put together an examples directory 👍


IPVM workflows are built on top of cryptographic capabilities, providing a strong basis for distributed computation in trustless networks. This even provides a clear basis for crossing Web 2.0 and Web3 systems, other computation networks, local operation (in full or part).

The IPVM Workflow spec extends on several other specs that have been developed to provide this basis:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the frontmatter still uses job (i.e. https://github.com/ipvm-wg/spec), but it may be good to highlight why we're using workflow > job, especially when other systems like Bacalhau use job as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the frontmatter still uses job (i.e. https://github.com/ipvm-wg/spec)

Great point! It will not use the term "job" in the long-term. Consider that old text, which will get updated independently.


If the `msg` branch is returned, the invocation MUST immediately rethrow with the update message.

Note that while IPVM MUST treat the pure tasks together as transactional. It is not possible to roll back any destructive effects that have already been run. As such, it is RECOMMENDED to have few (if any) tasks depend on the output of a destructive effect, so they can be scheduled at the end of the workflow.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any thoughts on how to enforce this in any way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about destructive actions almost like a cut in Prolog. You lose transactional guarantees as soon as you use them. We may want to include a flag or something to set your expected properties like transactional, but that's also probably a big ask of most devs. Handling this in tooling with e.g. a warning like "hey, just so you know, we won't be able to roll back prior to this email if something goes wrong"

| `out` | `{String: Any}` | The results of each call, the task's label. MAY contain sub-receipts. | Yes | |
| `meta` | `Any` | Non-normative extended fields | No | `null` |

If the `catch` field is set on the outer `Workflow`, The `out` field MAY include the output under the `ipvm/catch` key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would we have a need to actively want to throw an exception based on the execution of a task?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also covered this on the call after this was posted. For posterity:

I've been thinking about this as akin to stdout and stderr. If you run an HTTP PUT, and it fails, the runtime needs to return that as a Failure. If a Docker container exits with a nonzero exit code, we'll capture its stderr and put that in the Failure struct.

Good point: this needs to be in the text 👍

>
> J. Paul Morrison, [Flow-Based Programming](https://jpaulm.github.io/fbp/book.html)

The potential complexity of a fully distributed execution by untrusted peers is very high. IPVM Workflows reduce the number of possible states by forcing explicit handling of any dangerous effects. The IPVM Workflow spec is a declarative document that MAY be inspected, transmitted, logged, and negotiated. Unlike s systems like WASI, there is a strict separation of effects from pure data, an emphasis on verifiability, and [promise pipelining](http://erights.org/elib/distrib/pipeline.html).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: are effects still getting their own write-up?

Copy link
Member Author

@expede expede Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For those following along at home: Zeeshan and I chatted about this live)

They are, absolutely! I'm trying to keep the various specs as separate as possible; my current hypothesis is that we're going to do the scheduler safety properties for URI/Ability pairs as part of service discovery. They'll need to register something like:

Service
  { uri_scheme        = "https"
  , ability           = "crud/read"
  , idempotent        = False
  , destructive       = False
  , max_verifiability = Attestation
  }

We know that everything is some kind of effect unless it's raw Wasm, so that makes it really easy for the runtime to detect. If you want to run e.g. a pure subset of Python, you can definitely get an interpreter written in Wasm to feed the source into. We could open up pure: Bool as well, but the tooling for Wasm is advancing so quickly that I currently don't believe that we'll need anything beyond "run the source inside Wasm"

expede and others added 2 commits December 2, 2022 00:45

``` js
{
"some-wasm": {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these are shown in {String:Task} form – would it be clearer to just show the Task, to avoid confusion that the string is part of that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants