Unify/fix expiration bump logic in host. #957

Merged: 7 commits into stellar:main on Jul 28, 2023

Conversation

@dmkozh (Contributor) commented Jul 19, 2023

What

  • Store bumps directly in the storage. This addresses several issues with the previous bump implementation, namely rollbacks (these weren't supported at all) and consistency with storage (previously it was possible to bump an existing entry).
  • Move autobump logic to host
  • Make inclusion/exclusion logic for current ledger consistent when computing expiration ledger (only include it for the new entries)
  • Add `bump_contract_instance_and_code` host fn to allow bumping arbitrary contracts - useful for factory contracts

Why

Bugfixes and closing feature gaps.

Known limitations

N/A

For simplicity we assume max expiration is correct for newly-created entries (these are managed exclusively by the host, so we can trust them to not exceed the limits).
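
As a rough illustration of the first point under "What" above - keeping the bumped expiration inside the storage map so that frame rollbacks discard it - here is a minimal, hypothetical sketch. The types are simplified stand-ins, not the actual host `Storage` or `Host` types:

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the host's ledger key/entry types.
type LedgerKey = u64;

#[derive(Clone)]
struct EntryState {
    expiration_ledger: u32,
}

// Keeping the (possibly bumped) expiration on the entry inside the storage map
// means that restoring an earlier snapshot of the map also rolls back the bump,
// which a separate bump side-structure would not do automatically.
#[derive(Clone, Default)]
struct Storage {
    map: BTreeMap<LedgerKey, EntryState>,
}

impl Storage {
    fn bump(&mut self, key: LedgerKey, new_expiration: u32) {
        if let Some(entry) = self.map.get_mut(&key) {
            if new_expiration > entry.expiration_ledger {
                entry.expiration_ledger = new_expiration;
            }
        }
    }
}

fn main() {
    let mut storage = Storage::default();
    storage.map.insert(1, EntryState { expiration_ledger: 100 });

    // Snapshot taken when a failable frame (e.g. a `try_call`) is entered.
    let snapshot = storage.clone();
    storage.bump(1, 500);
    assert_eq!(storage.map[&1].expiration_ledger, 500);

    // The callee fails: restoring the snapshot discards the bump as well.
    storage = snapshot;
    assert_eq!(storage.map[&1].expiration_ledger, 100);
}
```

The actual host presumably manages its snapshots per call frame rather than by cloning whole maps; the sketch only shows why bumps recorded in the same structure as the entries get rolled back together with the other storage changes.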
@graydon (Contributor) left a comment


A few nits and questions. I think it's basically ok in approach, though I'm nervous about the "modifying read-only entries and checking for a subset of the changes later" part; somehow I convinced myself before that this approach literally can't work safely, but I'm not finding any record of how, so that's too vague an objection to block this on.

(I also thought the current behaviour wrt. rollbacks or lack thereof was intentional -- can you check with @jayz22 on that?)

Review comments (resolved): soroban-env-host/src/host.rs, soroban-env-host/src/host/frame.rs, soroban-env-host/src/auth.rs, soroban-env-host/src/storage.rs
@dmkozh (Contributor, Author) commented Jul 26, 2023

> though I'm nervous about the "modifying read-only entries and checking for a subset of the changes later"

We could do a parallel map of read-only bumps and store it separately in the Storage struct. It will make code a bit more complicated, but we can go for that just in case. RW entries should be bumped in-place though to keep the logic sane (again, bump-delete-insert example).

> (I also thought the current behaviour wrt. rollbacks or lack thereof was intentional -- can you check with @jayz22 on that?)

I don't see how the lack of rollbacks can be intentional. Performing bumps from, e.g., failed contract calls doesn't really make any sense.
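
For reference, here is a hypothetical sketch of the side-map alternative mentioned above (illustrative types only, not the actual `Storage` struct): RO bump requests accumulate in a separate map while RW entries keep being bumped in place.

```rust
use std::collections::BTreeMap;

type LedgerKey = u64;

// RO bump requests are recorded separately and applied later; RW entries are
// bumped in place so that bump-delete-insert sequences behave intuitively.
#[derive(Default)]
struct Storage {
    rw_expirations: BTreeMap<LedgerKey, u32>,
    ro_bump_requests: BTreeMap<LedgerKey, u32>,
}

impl Storage {
    fn bump_ro(&mut self, key: LedgerKey, new_expiration: u32) {
        let requested = self.ro_bump_requests.entry(key).or_insert(new_expiration);
        *requested = (*requested).max(new_expiration);
    }

    fn bump_rw(&mut self, key: LedgerKey, new_expiration: u32) {
        if let Some(expiration) = self.rw_expirations.get_mut(&key) {
            *expiration = (*expiration).max(new_expiration);
        }
    }
}

fn main() {
    let mut storage = Storage::default();
    storage.rw_expirations.insert(7, 100);
    storage.bump_rw(7, 150);
    storage.bump_ro(9, 300);
    storage.bump_ro(9, 250); // a smaller request never lowers a recorded bump
    println!("rw: {:?}, ro: {:?}", storage.rw_expirations, storage.ro_bump_requests);
}
```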

@graydon (Contributor) commented Jul 27, 2023

> We could do a parallel map of read-only bumps and store it separately in the Storage struct. It will make code a bit more complicated, but we can go for that just in case. RW entries should be bumped in-place though to keep the logic sane (again, bump-delete-insert example).

So... I recovered the conversation in which I talked this logic through before. I was mis-remembering: it was with @sisuresh, not @jay, and it had to do with read-write dependencies and parallelism.

The thing is, if you have two threads that both do a bump on a read-only entry, they really might execute in parallel. So you have to ensure that:

  • If A and B both bump X, you have a sensible model for "who to charge how much". Maybe they both get charged the amount they bumped and you write back the max of both bumps? It has to be well defined.
  • In any case, once A bumps (or B bumps) X, it's imperative that neither should observe the post-bump expiry-time of X, because doing so makes the value they read back dependent on the specific physical realization of logical partitions. Like a host that runs all logical partitions serially will read back a different value than a host that runs all logical partitions in parallel. We do not want the logical-to-physical mapping to be observable that way.

I think this may have to do with why there were no rollbacks on bumps, though I'm not sure. I'll continue to try to figure that out.

I want to be clear though: it's essential that we not accidentally introduce any read-after-write data dependencies on values we're assuming to be "read-only" with this work. It's entirely possible for a mistake here to defeat the whole concurrency control protocol we have planned, serializing everything into a single thread. We need to be very careful. The code in the PR, as written, I think violates this condition. It reads expiry times before bumping them, which means if two txs run back to back on a single thread and they both "bump a read-only entry", they will be observing their co-situation on that thread; scheduling the two of them on two separate threads would produce different results. That's not ok. We need a design that does not have that characteristic.
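
To make the scheduling concern concrete, here is a small, hypothetical illustration (not host code): if a transaction can read back the post-bump expiration of a shared read-only entry, what it observes depends on whether another transaction ran before it on the same thread or in parallel from the same snapshot.

```rust
// Each "transaction" observes an expiration and requests a bump.
// Returns (observed expiration, requested new expiration).
fn run_tx(observed_expiration: u32, bump_to: u32) -> (u32, u32) {
    (observed_expiration, observed_expiration.max(bump_to))
}

fn main() {
    let snapshot_expiration = 100;

    // Serial schedule on one thread, with the bump visible immediately:
    let (_, after_a) = run_tx(snapshot_expiration, 500);
    let (seen_by_b_serial, _) = run_tx(after_a, 400);

    // Parallel schedule, with both txs reading the same snapshot:
    let (seen_by_b_parallel, _) = run_tx(snapshot_expiration, 400);

    // B observes 500 under one schedule and 100 under the other, i.e. the
    // physical schedule has become observable.
    assert_ne!(seen_by_b_serial, seen_by_b_parallel);
    println!("serial: {seen_by_b_serial}, parallel: {seen_by_b_parallel}");
}
```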

@dmkozh (Contributor, Author) commented Jul 27, 2023

> If A and B both bump X, you have a sensible model for "who to charge how much". Maybe they both get charged the amount they bumped and you write back the max of both bumps? It has to be well defined.

This is defined, but probably not in a discoverable place... Currently we charge bumps relative to the current ledger state (so if A bumps by 1000 and then B bumps by 1100, A is charged for a bump of 1000 and B for a bump of 100). When we have concurrent execution, everyone will be charged against the ledger snapshot taken before the tx is executed - that's the only way to get deterministic fees. Note that this logic has nothing to do with the host, really.
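
A small sketch of the two charging models described above (illustrative numbers, expressed simply as ledgers of extension; not the actual fee code):

```rust
// Sequential model: each bump is charged relative to the current (possibly
// already-bumped) expiration.
fn charge_sequential(current: &mut u32, requested: u32) -> u32 {
    let charged = requested.saturating_sub(*current);
    *current = (*current).max(requested);
    charged
}

// Snapshot model: every tx is charged against the snapshot taken before it is
// executed, which keeps fees deterministic under concurrent execution.
fn charge_against_snapshot(snapshot: u32, requested: u32) -> u32 {
    requested.saturating_sub(snapshot)
}

fn main() {
    // A requests an expiration 1000 ledgers above the snapshot, B requests 1100.
    let snapshot = 0;
    let mut current = snapshot;
    let a_seq = charge_sequential(&mut current, 1000);
    let b_seq = charge_sequential(&mut current, 1100);
    assert_eq!((a_seq, b_seq), (1000, 100)); // the example from the comment above

    let a_snap = charge_against_snapshot(snapshot, 1000);
    let b_snap = charge_against_snapshot(snapshot, 1100);
    assert_eq!((a_snap, b_snap), (1000, 1100));
}
```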

> it's imperative that neither should observe the post-bump expiry-time of X,

The final bump is not observable of course. But every host instance can (and should) observe the bump it has performed within the invocation.

> Like a host that runs all logical partitions serially will read back a different value than a host that runs all logical partitions in parallel.

I'm not sure what you mean here. All the host instances are independent, aren't they? Nothing in the current design suggests running several host functions in parallel. Every invocation needs to have a separate host with separate storage and budget.

> I think this may have to do with why there were no rollbacks on bumps, though I'm not sure. I'll continue to try to figure that out.

Rollbacks are about the logic of the current invocation. They should behave in an intuitive fashion, and that shouldn't depend on the environment the host runs in, concurrency, etc. From the contract perspective it is expected that `try_call` has no side effects on failure, besides the budget being charged.

> The code in the PR, as written, I think violates this condition. It reads expiry times before bumping them, which means if two txs run back to back on a single thread and they both "bump a read-only entry", they will be observing their co-situation on that thread; scheduling the two of them on two separate threads would produce different results.

I think this comes back to the point above - it is surprising to me that you're talking about running several operations on the same host instance. That's not possible with the current design, for many more reasons than just rent bumps. If this ever becomes a design goal, we'll need to change a lot about how the host is implemented. This PR (and the follow-ups) works under the assumption that every invocation is isolated within a host instance. That's what the current design definitely allows for. It's up to the embedder to provide the proper storage state that doesn't lead to non-determinism.

@dmkozh (Contributor, Author) commented Jul 27, 2023

> It's entirely possible for a mistake here to defeat the whole concurrency control protocol we have planned, serializing everything into a single thread

I also want to emphasize once more that this PR doesn't change the observable host behavior compared to the current implementation. It only addresses a few bugs (or at least non-intuitive behaviors, depending on what you call them). Currently it's core that does autobumps and aggregates the duplicate bumps, following exactly the same logic as has been implemented here.

@dmkozh (Contributor, Author) commented Jul 27, 2023

> it's essential that we not accidentally introduce any read-after-write data dependencies on values we're assuming to be "read-only" with this work.

Another thought: we don't enforce any tx sequencing in general. So it's possible that tx 1 modifies an entry, then tx 2 reads it, etc. Clearly, we will need a protocol change to introduce concurrent execution. It will also take care of bumps, but I really think these are actually much simpler than everything else - for bumps you just need to be sure to provide the initial expiration ledger of all the RO entries. RW/RO entry coordination, on the other hand, would need some complex tx-ordering changes.

@graydon (Contributor) commented Jul 27, 2023

No, this PR changes semantics. It's very important that we get clear on expectations here because this can break concurrency.

The assumed/intended model is that a txset S will get broken into N logical partitions P1..PN and each logical partition has sequential semantics: if two transactions T1 and T2 in P1 both have key K in their RW footprint then T2 will observe T1's write to K. The logical partitions are driven by the footprints and encapsulate all such observable sequential data dependency, such that we may choose to schedule P1 and P2 on a single thread, or two separate threads, and the transactions in either partition cannot observe that scheduling decision. Creating observable read after write dependencies that span partitions (as a write to a RO footprint entry would do) breaks this model.

This has nothing to do with the lifecycle of a host. One transaction will still get one host instance. But changes to storage written by T1 are visible to T2. That is why there are RW entries in the footprint at all: to group together all the transactions with overlapping RW dependencies into a partition to give them a defined serialization order.
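
A hypothetical sketch of one way the partitioning model described above could be computed (not the actual core scheduler): transactions whose read-write footprints overlap, directly or transitively, end up in the same logical partition and are applied sequentially, while distinct partitions could be scheduled on separate threads.

```rust
use std::collections::BTreeSet;

type Key = &'static str;

struct Tx {
    id: usize,
    rw_footprint: BTreeSet<Key>,
}

// Group transactions whose RW footprints overlap (transitively) into
// partitions: each partition must run sequentially, while partitions with no
// shared RW keys could run on separate threads.
fn partition(txs: &[Tx]) -> Vec<(BTreeSet<Key>, Vec<usize>)> {
    let mut parts: Vec<(BTreeSet<Key>, Vec<usize>)> = Vec::new();
    for tx in txs {
        // Split existing partitions into those sharing a key with this tx and
        // those that don't.
        let (mut overlapping, disjoint): (Vec<_>, Vec<_>) = parts
            .into_iter()
            .partition(|(keys, _)| !keys.is_disjoint(&tx.rw_footprint));
        // Merge all overlapping partitions, together with this tx, into one.
        let mut keys = tx.rw_footprint.clone();
        let mut ids = vec![tx.id];
        for (k, i) in overlapping.drain(..) {
            keys.extend(k);
            ids.extend(i);
        }
        parts = disjoint;
        parts.push((keys, ids));
    }
    parts
}

fn main() {
    let txs = [
        Tx { id: 1, rw_footprint: ["a", "b"].into_iter().collect() },
        Tx { id: 2, rw_footprint: ["c"].into_iter().collect() },
        Tx { id: 3, rw_footprint: ["b", "c"].into_iter().collect() }, // links 1 and 2
    ];
    for (keys, ids) in partition(&txs) {
        println!("partition over {:?} runs txs {:?} in order", keys, ids);
    }
}
```

Note that only RW keys drive the grouping here; sharing a read-only dependency deliberately does not merge partitions, which is why RO bumps must not create observable write dependencies across them.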

@graydon (Contributor) commented Jul 27, 2023

We are not going to make up a concurrency control protocol in a future protocol change. The system design already has a concurrency control protocol; it's why there are footprints. If you're not aware of how it works, that's fine, and I'm happy to keep discussing it until it's clear, but there's definitely already a design here.

@dmkozh (Contributor, Author) commented Jul 27, 2023

> Creating observable read after write dependencies that span partitions (as a write to a RO footprint entry would do) breaks this model.

There are no observable dependencies here (or, to be more precise, preventing these has nothing to do with the host). The scheduler will need to make sure to provide the host with the proper data snapshot, such that the RO entry passed to P1 is the same as the RO entry passed to P2. This is true for both the current implementation and the updated one; it's just that the code was scattered around before. In any case, core will need to reconcile the bumps to pick the max one, but that will only happen after all transactions are applied. And that logic has nothing to do with the host logic. The host is just a black box that takes a ledger snapshot and produces a delta to it (which is represented explicitly in the follow-up PR). I don't think we should really care about what happens within the host itself - the host storage is not a direct abstraction of the ledger and it's not meant to be blindly written to the ledger.
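
A minimal sketch of the reconciliation step described above (hypothetical types; the core-side logic lives outside the host): each invocation produces a delta of requested expirations, and only after the whole txset is applied does core fold the deltas together, keeping the maximum per entry.

```rust
use std::collections::BTreeMap;

type LedgerKey = u64;

// Fold per-invocation bump deltas into the ledger, keeping the max requested
// expiration per entry; nothing is applied until all transactions have run.
fn reconcile(ledger: &mut BTreeMap<LedgerKey, u32>, deltas: &[BTreeMap<LedgerKey, u32>]) {
    for delta in deltas {
        for (key, requested) in delta {
            if let Some(expiration) = ledger.get_mut(key) {
                *expiration = (*expiration).max(*requested);
            }
        }
    }
}

fn main() {
    let mut ledger = BTreeMap::from([(1_u64, 100_u32)]);
    // Two independent invocations, each computed against the same snapshot.
    let tx_a = BTreeMap::from([(1_u64, 500_u32)]);
    let tx_b = BTreeMap::from([(1_u64, 400_u32)]);
    reconcile(&mut ledger, &[tx_a, tx_b]);
    assert_eq!(ledger[&1_u64], 500);
}
```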

@graydon (Contributor) commented Jul 28, 2023

Hm, ok, I .. can imagine what you're saying might work, apologies for overstating my case; but I am still a little concerned about the details.

I think there are .. maybe a few separate issues we need to work through on this subject, not all of which are directly caused by or solvable by this PR. I guess I just want to make sure this PR is not itself painting us further into a corner, and also make sure we're all on the same page about the corner we're currently already painted into with bumps, and maybe try to formulate a plan to fix it rather than making it worse:

  1. Currently, somewhat independent from this PR (though I suspect related to near-term core logic), it seems like we actually apply bumps immediately in core on a per-transaction basis. This means that a RO-entry bump in transaction T1 is observable in transaction T2 in the same txset. I think this is wrong and needs to be undone -- it needs to do what you say here, applying a max of all committed bumps per LE, per txset -- and if we leave it as-is we won't be able to parallelize separate partitions that share an RO dependency at all in the future. Which is the whole point of having RO dependencies.
  2. Assuming we fixed that, we would (I'd hope!) be in a position where RO bumps are not observable at all, by anyone, until the end of a txset. This would preserve the "RO"-ness of RO entries, keeping them consistent with the strict serial semantics we give every transaction, because despite being shared by multiple possibly-parallel partitions, they simply can't change. Constant data is trivially serially consistent, all orders of observation are equal! All good. Except then this PR comes along and adds a new weird third kind of semantics: not the effectively-serial-because-all-potential-parallelism-is-non-interfering RW semantics, not the effectively-serial-because-all-sharing-is-RO semantics, but serial-within-a-transaction-then-reset-to-previous-state-at-the-next-transaction. That is, if I am reading correctly, this PR allows an RO-bump on line N of a contract (in a single tx) to be observed on line N+1 of the same contract (by doing some action that might fail depending on current expiration time). While these semantics are not inherently going to interfere with parallelism (we're not talking about intra-tx parallelism ever) they seem .. at least inconsistent with the strict serial semantics we're aiming for everywhere else, both intra- and inter-transaction.

@dmkozh (Contributor, Author) commented Jul 28, 2023

> This means that a RO-entry bump in transaction T1 is observable in transaction T2 in the same txset.

Yes, this is what I referred to when I said that I don't change anything functionally. The implementation is also temporary, as we don't do RO bumps properly at all. This change doesn't alter anything, however - core will still get the bumps in exactly the same fashion as before.

> While these semantics are not inherently going to interfere with parallelism (we're not talking about intra-tx parallelism ever) they seem .. at least inconsistent with the strict serial semantics we're aiming for everywhere else, both intra- and inter-transaction.

I don't quite agree here. There is nothing wrong with these semantics from the contract standpoint. Bumps shouldn't be in any way special to the user. While cross-tx bumps indeed shouldn't be observable until the whole tx set is processed, from the tx standpoint they definitely do have to be observable, as otherwise we won't be able to come up with proper fees and meta. So every tx must be executed 'as if' it were the only tx that does bumps. Given that requirement, it doesn't really matter what we do in the host.

The only thing we need to do to ensure parallelism is to provide the proper ledger snapshot to the host. Otherwise, I might make the same statement about RW entries - if you try to execute two host instances that touch the same RW value in parallel, then you'll get inconsistent results. It shouldn't really be the host's business to make these decisions.

> by doing some action that might fail depending on current expiration time

Also, the current expiration time is only needed for autobumps. Here we look at the max expiration, which is a function of the ledger sequence. If you're concerned about autobumps, we might as well get rid of them (though that would need a broader discussion). But again, given the proper snapshot, there is no issue here either: let's say tx A and tx B see an entry at expiration N and autobump it to N+10. Both pay for a 10-ledger bump, and core then reconciles this into an N+10 expiration.
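
In code, the autobump example above would look something like this (purely illustrative numbers, with fees again expressed as ledgers of extension):

```rust
fn main() {
    // Both txs see the entry at expiration N in the same snapshot.
    let n: u32 = 1_000;
    let autobump: u32 = 10;

    let requested_by_a = n + autobump;
    let requested_by_b = n + autobump;

    // Each pays for a 10-ledger bump, charged against the shared snapshot.
    let fee_a = requested_by_a - n;
    let fee_b = requested_by_b - n;

    // Core reconciles to the max request: N + 10, not N + 20.
    let reconciled = requested_by_a.max(requested_by_b);
    assert_eq!((fee_a, fee_b, reconciled), (10, 10, n + 10));
}
```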

@graydon (Contributor) commented Jul 28, 2023

> from the tx standpoint they definitely do have to be observable

"Observable" here can mean two things: one is "you can observe that a bump was issued that wants a delta of N" (which isn't a problem, or even possible to prevent -- you can always just emit an event saying that you did it); the other is "you can observe the post-state of that delta, that the current expiry is K", which I'm not OK with.

> if you try to execute two host instances that touch the same RW value in parallel, then you'll get inconsistent results

But the whole point of entries being marked RW is that all txs sharing a given RW dependency get grouped into a sequentially-executed partition.

@dmkozh (Contributor, Author) commented Jul 28, 2023

> the other is "you can observe the post-state of that delta, that the current expiry is K", which I'm not OK with.

I think we're on the same page here, and this change doesn't cause any observability changes. The output of the function call is 'this invocation wants to bump expiration of K from N to N + K'. It doesn't imply that it should be applied immediately.

> But the whole point of entries being marked RW is that all txs sharing a given RW dependency get grouped into a sequentially-executed partition.

Sure, but what prevents us from defining the semantics for bumps as 'RO bumps don't immediately modify the ledger state'? This really has nothing to do with the host logic in the end. This implementation is just more convenient and less error-prone than the alternatives. We could have a separate map with only the bumps, but it would need to care about the storage state in the same fashion. If you feel like that's somehow safer, then we can go for that as well, but really it doesn't change the final output and doesn't fix the issue of core not providing the proper snapshot for RO entries (which is being addressed separately).

@graydon (Contributor) commented Jul 28, 2023

Discussed in person; we'll revisit the issues around bump concurrency semantics later.

@graydon merged commit 920b73d into stellar:main on Jul 28, 2023. 9 checks passed.