-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CIP-0139? | Plutus Core Builtin Type - ByteStringMap #921
CIP-0139? | Plutus Core Builtin Type - ByteStringMap #921
Conversation
CIP-bytestring-map-builtin/README.md
Outdated
- We consider the costing size of a `ByteStringMap` to be equal to the size of its corresponding representation as an association list. | ||
- The binary encoding of `ByteStringMap`s is equivalent to its encoding as an association list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwxm - please review. This is based on the standard flat
implementation for Haskell's Data.Map
.
CIP-bytestring-map-builtin/README.md
Outdated
3. `deleteBSMap :: forall v . ByteString -> ByteStringMap v -> ByteStringMap v` | ||
- it returns a map without the key-value association identified by the input key | ||
- it uses logarithmic time in the `ByteStringMap` argument, linear time in the `ByteString` argument and logarithmic memory in the `ByteStringMap` argument | ||
4. `unionBSMap :: forall v . ByteStringMap v -> ByteStringMap v -> ByteStringMap v` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most important use-case for this builtin data-structure is to facilitate fast operations over the Value
type defined as Map.Map CurrencySymbol (Map.Map TokenName Integer)
. These operations are by far one of the biggest performance bottlenecks for onchain code.
It is extremely expensive to apply validation logic to Value
s in onchain code because of the high cost of map operations. union
is by far the most expensive value operation, but it also used nearly universally by dApps and is a huge performance bottleneck. This builtin data-structure needs to have a semigroup collision handling for Values
either unionBSMap
with some bool flag indicating a value or just a separate unionBSMapValue
builtin for use with Value
types.
The common pattern shared across most protocols is as follows:
- Given an input, construct the value expected to be possessed by the output which seeks to satisfy the spending conditions of said input.
- Check that the expected output value matches the actual output value
The construction of the expected output value is in the majority of cases a union of the input value and a value delta (the quantities of
assets to be added to / removed from the input, ie. the difference in value between the input UTxO and the output UTxO).
processRequest :: ProtocolStateDatum -> TxInInfo -> TxOut -> Bool
processRequest stateDatum txIn txOut =
let resolvedInput = (txInInfoResolved txIn)
inputValue = txOutValue resolvedInput
outputValue = txOutValue txOut
-- Given a tx input and the relevant state datum compute the value delta between the input value
-- and the expected value delta between the input value and the expected. (ie the quantities of
-- assets to be added to / removed from the input.
inputValueDelta = computeValueDelta txIn stateDatum
expectedOutputValue = inputValue <> inputValueDelta
in expectedOutputValue == outputValue
For examples in production dApps see:
https://github.com/SundaeSwap-finance/sundae-contracts/blob/edc118880d3baffcb7d5bd277faec2e7dc54c59b/lib/calculation/swap.ak#L120C1-L124C64
https://github.com/minswap/minswap-dex-v2/blob/c3293ad435792ba502f5a2b1691c6736e5f167e6/lib/amm_dex_v2/order_validation.ak#L530
https://github.com/minswap/minswap-dex-v2/blob/c3293ad435792ba502f5a2b1691c6736e5f167e6/lib/amm_dex_v2/order_validation.ak#L136
https://github.com/lenfiLabs/lenfi-smart-contracts/blob/eb7dc685f3888c940d0da027c46b5f1c3d9578f1/validators/order_contract.ak#L164
https://github.com/minswap/minswap-lbe-v2/blob/6fea91d881d6037e9c944ba39f6068a1d6df4e42/lib/lb_v2/validation.ak#L429
https://github.com/minswap/minswap-lbe-v2/blob/6fea91d881d6037e9c944ba39f6068a1d6df4e42/lib/lb_v2/order_validation.ak#L237
https://github.com/minswap/minswap-lbe-v2/blob/6fea91d881d6037e9c944ba39f6068a1d6df4e42/lib/lb_v2/order_validation.ak#L82
https://github.com/Danogo2023/bond-dex/blob/f29effc2ca1f354fd8a9c3695ec58926f940de84/lib/bond/limit_bid/withdraw.ak#L392-L397
For explanation on why many of these are using value.add
(insert with semigroup collision handling instead of left/right-based collision)
instead of merge see https://github.com/txpipe-shop/sundae-swap/blob/d8adea189f16b3a585ea311d6467f5fe2fda78d4/src/audit.typ#L1099-L1119
probably due to the overhead introduced by recursion in onchain code, merge is more expensive than add and thus should be preferred when you have a fixed number of assets to update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@colll78 Thanks for all the pointers! Sounds like we should have insertWithBSMap
as a builtin. With higher order function support coming soon, this makes total sense.
I'm unsure about the following:
- Should we keep
insertBSMap
- Do we need a
unionWithBSMap
builtin, or doesinsertWithBSMap
suffice - What can we do to speed up equality checks on Values? It's not simply map equality due to potential zero-value entries. Perhaps
filterBSMap
+equalsBSMap
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Map equality is cheap because we currently take advantage of equalsData
.
equalsData (mergeValue ownInputValue tokensToAdd) ownOutputValue
To handle zero-value entries mergeValue
simply deletes them, I would expect that the union function would similarly be capable of deleting zero value entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do need unionWithBSMap
and I think filterBSMap
would be extremely helpful as-well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't have insertWithBSMap
or unionWithBSMap
. The upcoming higher-order builtins allow one to return an application from a builtin call, not to execute one within the builtin (we used to have that functionality but it was removed a long time ago) and we need the latter to be able to implement insertWithBSMap
or unionWithBSMap
as a builtin. We can try to design such functionality, but it's probably not gonna be pretty. I've spent some time several years ago thinking about roughly this problem and had an idea back then, so if you want me to try it out, I can do that (it'll be a few days of work before I can tell whether it's promising or not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then just a default mergeValue
and insertValue
(that you don't provide a function to, and only works on canonical ledger constructed Value
as BuiltinData
) would be just as impactful. Dealing with Values
is the main use-case and one of the main efficiency bottlenecks in most protocols, it makes sense to have specialized builtins for interacting with Values
given they are one of the two most foundational components of smart contracts (the purpose of smart contracts is to validate the transfer of and change in Value
and Data
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, perhaps one of the most universally beloved smart contract platforms in the blockchain space defines a programming paradigm "Asset oriented programming", the entire language has been built around the notion that it natively understands assets and provides native first-class features for assets. The introduction of such builtins would also vastly enhance the security of the ecosystem because the implementation of such functions in onchain code are extremely error prone, especially when you look at all the black magic optimization that libraries are using (which wouldn't be necessary with such builtins).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@colll78 sounds interesting. Wanna open an issue in the Plutus repo about that?
However I personally prefer to be as general as possible until we're sure that that's not enough. So I'd rather investigate unionWithBSMap
, maybe we can do that and then we can keep domain-specific logic in the user space. If that's too hard or isn't going to fly, then I agree with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is reasonable.
mergeValue :: BuiltinData -> BuiltinData -> BuiltinData
mergeValue v1 v2 =
let valueMap1 = listToBSMap (BI.unsafeDataAsMap v1)
valueMap2 = listToBSMap (BI.unsafeDataAsMap v2)
combinedVal = unionWithBSMap (\innerA innerB -> unionWithBsMap BI.addInteger (listToBSMap (BI.unsafeDataAsMap innerA)) (listToBSMap (BI.unsafeDataAsMap innerB))) valueMap1 valueMap2
in toBuiltinData $ filterWithBSMap ((/=) 0) combinedVal
In this case filterWithBSMap
and equalsBSMap
are definitely required.
CIP-bytestring-map-builtin/README.md
Outdated
- it has `O(n * logn)` time complexity, where `n` is the size of the list | ||
- it has linear space complexity | ||
- if the input list does not contain duplicate keys and is ordered (in ascending order of the keys), the space and time complexity will be linear | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following should also be added:
keysBSMap :: forall v . BuiltinByteStringMap v -> BuiltinList ByteString
valuesBSMap :: forall v. BuiltinByteStringMap v -> BuiltinList v
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well-written, thank you.
Sorry for being a party pooper on insertWithBSMap
/unionWithBSMap
CIP-bytestring-map-builtin/README.md
Outdated
|
||
Why are higher-order builtin functions necessary? Map keys must be processed (or simply, just compared) in order to be able to build and operate over well-defined maps. For arbitrary key types, that would only be possible by passing the required processing or comparison functions as arguments to the builtin functions. | ||
|
||
The only way to implement a builtin map with a (builtin) polymorphic key type would be to modify the builtins language itself. A compiler would have to only allow the construction of maps with key types which are known a priori to posess some property (such as a total ordering). The Plutus Core evaluator would have to check for this as well. The implementation itself would not be too complicated, but we must first consider how this would change the formal semantics of the builtins language. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we discussed that and we don't really need to modify anything much? We can just fail at runtime for built-in types that cannot implement Ord
(or PlutusOrd
or whatever). And then we'll expose a safe interface in Plinth forbidding one from using such built-in types as keys in maps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment still stands, I don't believe you'd need to pass a comparison function around, keys can only be of built-in types and we either know how to compare those (in most cases) or don't (like those pesky crypto types) and in the latter case we can just throw. UPLC is a low-level language, it's fine for it to throw like that, and high-level languages will be able to provide a type-safe interface. No meta-theoretical properties are hurt. General maps are gonna be a bit less efficient though, but if done right, probably insignificantly so.
CIP-bytestring-map-builtin/README.md
Outdated
We propose the addition of the following builtin type: `ByteStringMap` of kind `Type -> Type`. | ||
- It represents a map associating keys of type `ByteString` with values of any builtin type. | ||
- The new builtin type should be implemented using a tree data structure allowing for logarithmic-time map operations. | ||
- We consider the costing size of a `ByteStringMap` to be equal to the size of its corresponding representation as an association list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about that, we'll have to carry some additional info around for balancing purposes, so it sounds like costing should reflect that. And an element in an association list is a pointer to a pair with two pointers to actual content, in our case it'll probably be just the latter two pointers. So yeah, I don't think we should assume that costing is going to be exactly the same, although I suppose that could be a good enough approximation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwxm what should we say here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could simply say "roughly equal" instead of "equal" and that'll be fine, this is all to be figured out during the costing phase anyway, it's fine not to have the exact answer now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwxm what should we say here?
@ana-pantilie: sorry, I missed this. We actually have to caculate the size of the map object before we carry out any operations on it, and that might be expensive if we have to traverse a tree to findi ts size. One way to alleviate this might be to have the maps carry their sizes about with them and make sure that the update operations update the size as well (and during deserialisation we'd have to check that maps embedded in scripts don't lie about their size). We might even need two measures of size: one telling you the number of elements of the map (for converting to a list, for example), and one telling you the depth of the map (for insertion etc.). We could do this by using newtype wrappers to choose which size measure should be applied. Maybe we could just store the size of the object and take the log instead of measuring the depth, since these should be proportional if the tree's always kept balanced. This is all probably too detailed for a CIP though, so saying that the size should be something like the number of entries should be fine: we can work out the details later.
CIP-bytestring-map-builtin/README.md
Outdated
- We consider the costing size of a `ByteStringMap` to be equal to the size of its corresponding representation as an association list. | ||
- The binary encoding of `ByteStringMap`s is equivalent to its encoding as an association list. | ||
|
||
**Note**: Here `Type` is the universe of all builtin types, since we do not consider types formed out of applying builtin types to arbitrary types to be inhabited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be a type variable or a type application reducing to a built-in type etc, so it's not just the built-in types.
CIP-bytestring-map-builtin/README.md
Outdated
3. `deleteBSMap :: forall v . ByteString -> ByteStringMap v -> ByteStringMap v` | ||
- it returns a map without the key-value association identified by the input key | ||
- it uses logarithmic time in the `ByteStringMap` argument, linear time in the `ByteString` argument and logarithmic memory in the `ByteStringMap` argument | ||
4. `unionBSMap :: forall v . ByteStringMap v -> ByteStringMap v -> ByteStringMap v` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't have insertWithBSMap
or unionWithBSMap
. The upcoming higher-order builtins allow one to return an application from a builtin call, not to execute one within the builtin (we used to have that functionality but it was removed a long time ago) and we need the latter to be able to implement insertWithBSMap
or unionWithBSMap
as a builtin. We can try to design such functionality, but it's probably not gonna be pretty. I've spent some time several years ago thinking about roughly this problem and had an idea back then, so if you want me to try it out, I can do that (it'll be a few days of work before I can tell whether it's promising or not).
CIP-bytestring-map-builtin/README.md
Outdated
|
||
### Traversing `ByteStringMap`s | ||
|
||
Unfortunately, due to the limitations presented earlier we cannot provide an effective builtin fold over maps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do it we implemented Allow for returning application trees from builtins a.k.a. folding builtins, which is a straightforward generalization of pattern matching builtins and shouldn't be too hard to get done (famous last words).
Co-authored-by: effectfully <[email protected]>
Co-authored-by: Ryan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ana-pantilie - a well received pair of proposals in general (with #915) as confirmed at today's CIP meeting; please rename the containing directory accordingly & update the proposal link in the initial posting 🎉
@rphair Great news, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments. Unfortunately, there's something that I didn't realize the first time and it's that listToBSMap
is unusable, because at the moment it's not possible to construct arbitrary lists of pairs, only lists of pairs of Data
. Neither the specification nor the formalization have this issue, but there's been zero work done to address it on the implementation side. And it is very non-trivial if we want to do it properly.
|
||
The only remaining solution is to consider introducing a builtin map type. | ||
|
||
Unfortunately, the builtin language has a significant limitation: it does not allow for higher-order builtin functions. As of the time of writing, builtin functions may not receive other functions as arguments, but there is ongoing work to lift this restriction. However, due to complex design considerations, repeated applications of function arguments will incur high performance costs. Since maps are recursive data structures, these high performance costs will be accumulated and such an approach will still not be feasible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of the time of writing, builtin functions may not receive other functions as arguments, but there is ongoing work to lift this restriction.
The restriction has been lifted, so this builtin:
lookupBSMap :: forall v . ByteString -> (() -> v) -> ByteStringMap v -> v
is now expressible. BTW, the same logic as in the array CIP applies, so it should probably be
lookupBSMap :: forall v . (() -> v) -> ByteStringMap v -> ByteString -> v
We propose the following set of builtin functions to accompany the new builtin type: | ||
1. `insertBSMap :: forall v . ByteString -> v -> ByteStringMap v -> ByteStringMap v` | ||
- it returns a map with the input key updated to the new value, silently discarding any previous value | ||
- it uses logarithmic time in the `ByteStringMap` argument, linear time in the `ByteString` argument, and logarithmic memory in the `ByteStringMap` argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please talk about insertWith
/unionWith
a bit? Specifically, that we want to have them, but cannot unless one of
is implemented.
3. `deleteBSMap :: forall v . ByteString -> ByteStringMap v -> ByteStringMap v` | ||
- it returns a map without the key-value association identified by the input key | ||
- it uses logarithmic time in the `ByteStringMap` argument, linear time in the `ByteString` argument and logarithmic memory in the `ByteStringMap` argument | ||
4. `unionBSMap :: forall v . ByteStringMap v -> ByteStringMap v -> ByteStringMap v` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we want intersectionBSMap
and differenceBSMap
then.
A horrible nit btw: I'd personally prefer a different capitalization: unionBsMap
. But feel free to ignore.
- it transforms the map into an association list | ||
- the resulting list should be ordered (in ascending order of the keys) | ||
- it is linear in the size of the `ByteStringMap` | ||
6. `listToBSMap :: forall v . List (Pair ByteString v) -> ByteStringMap v` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this has a big problem. I.e. at the moment this builtin is definable but basically unusable. Please reference that issue here.
|
||
`insertBSMap` and `deleteBSMap` remain fundamental operations which we must expose as builtin functions. | ||
|
||
One seemingly unnecessary addition is `unionBSMap`, as it may be implemented in Plutus Core by traversing the elements of one of the maps and calling `insertBSMap`. However, as outlined by [CPS-0013][1], `ByteStringMap` should provide an optimal implementation of the Plutus Tx `Value` type, and the operations on `Value`s depend heavily on map unions. Therefore, for performance reasons, we prefer to expose this functionality as a builtin function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt that Map
-backed Value
is going to happen, it's way too much complexity just to get a slow version of a hypothetical built-in Value
.
We propose the following set of builtin functions to accompany the new builtin type: | ||
1. `insertBSMap :: forall v . ByteString -> v -> ByteStringMap v -> ByteStringMap v` | ||
- it returns a map with the input key updated to the new value, silently discarding any previous value | ||
- it uses logarithmic time in the `ByteStringMap` argument, linear time in the `ByteString` argument, and logarithmic memory in the `ByteStringMap` argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the memory usage relly be logarithmic in the size of the map? In most functional data structures adding a new entry just requires a constant amount of space. Maybe it's not that simple if you need to keep your structure balanced though: I should go and look up how you actually implement these things.
|
||
We propose the addition of the following builtin type: `ByteStringMap` of kind `Type -> Type`. | ||
- It represents a map associating keys of type `ByteString` with values of any builtin type. | ||
- The new builtin type should be implemented using a tree data structure allowing for logarithmic-time map operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we have to implement all of this ourselves, or is there some existing library that we can use?
After discussing internally, I've unfortunately decided to close this CIP. The reason behind this decision is that we have discovered that it will not actually solve the original problem: introducing a builtin type for representing @colll78 and @effectfully believe that all the overhead required to implement the To make matters worse, the limitation regarding Whether a builtin map could ever be useful, @colll78 believes a more general builtin To conclude, the language isn't ready yet for a builtin I'm glad that at least this CIP has generated discussions which have made that clear and we have a better idea of what to try next. @rphair thank you for your time considering and reviewing this CIP. |
Thank you as well @ana-pantilie for your honesty & comprehensive documentation, along with your reviewers, of this problem space. ⭐ |
We propose a
ByteStringMap
type for Plutus Core. This type will have logarithmic time lookups, which are difficult to achieve in Plutus Core programs today.(Rendered Version)