From eaf8885989f299da40b68fd83fc36a43fb42b543 Mon Sep 17 00:00:00 2001 From: Dmytro Kozhevin Date: Wed, 23 Aug 2023 19:06:31 -0400 Subject: [PATCH 1/7] Add some links to Soroban auth CAP (#1385) --- core/cap-0046-11.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/core/cap-0046-11.md b/core/cap-0046-11.md index 54ebf43d5..1da140fbc 100644 --- a/core/cap-0046-11.md +++ b/core/cap-0046-11.md @@ -590,4 +590,16 @@ scope of the invoked contract. Built-in token contract uses authorization framework as well, and there is a potential attack surface for unauthorized access to the trustlines and Stellar account balances, but the authorization logic has to be in the host anyway, so -this CAP doesn't significantly change the risks for the built-in token. \ No newline at end of file +this CAP doesn't significantly change the risks for the built-in token. + +## Implementation + +[`auth.rs`](https://github.com/stellar/rs-soroban-env/blob/d92944576e2301c9866215efcdc4bbd24a5f3981/soroban-env-host/src/auth.rs) +file of Soroban host contains the implementation for the authorization framework. + +[`account_contract.rs`](https://github.com/stellar/rs-soroban-env/blob/d92944576e2301c9866215efcdc4bbd24a5f3981/soroban-env-host/src/native_contract/account_contract.rs) +file of Soroban host contains the implementation of Stellar account authentication as well as the harness for calling the custom contracts. + +## Test Cases + +[test/auth.rs](https://github.com/stellar/rs-soroban-env/blob/d92944576e2301c9866215efcdc4bbd24a5f3981/soroban-env-host/src/test/auth.rs) in Soroban host contains the comprehensive tests for various authorization scenarios. From 825b9e6764ae755c2a401289e491f54af8a1f9e6 Mon Sep 17 00:00:00 2001 From: Graydon Hoare Date: Thu, 24 Aug 2023 11:17:07 -0700 Subject: [PATCH 2/7] Update CAP 0046 and 0046-1 with past several months of changes. (#1377) * Update CAP 0046 and 0046-1 with past several months of changes. 
* Discuss versioning, add earlier link to host functions * address review comments --- core/cap-0046-01.md | 343 +++++++++++++++------- core/cap-0046.md | 693 ++++++++++++++++++++++++++------------------ 2 files changed, 649 insertions(+), 387 deletions(-) diff --git a/core/cap-0046-01.md b/core/cap-0046-01.md index 617657c6b..52d69bfb6 100644 --- a/core/cap-0046-01.md +++ b/core/cap-0046-01.md @@ -2,7 +2,7 @@ ``` CAP: 0046-01 (formerly 0046) -Title: WebAssembly Smart Contract Runtime Environment +Title: Soroban Runtime Environment Working Group: Owner: Graydon Hoare <@graydon> Authors: Graydon Hoare <@graydon> @@ -15,7 +15,7 @@ Protocol version: TBD ## Simple Summary -This CAP specifies the lowest-level **code execution** and **data model** aspects of a WebAssembly-based (WASM) "smart contract" system for the Stellar network. WASM smart contract code runs as a **guest** inside of a virtual machine (VM) which is embedded in a **host** environment. +This CAP specifies the lowest-level **code execution** and **data model** aspects of a WebAssembly-based (WASM) "smart contract" system for the Stellar network, called Soroban. WASM smart contract code runs as a **guest** inside of a virtual machine (VM) which is embedded in a **host** environment. Higher-level components of a smart contract system such as ledger entries, host objects and host functions, and transactions to manage and invoke contracts will be specified in additional CAPs. This CAP focuses only on the lowest-level components. @@ -44,30 +44,30 @@ The primary requirement for any smart contract system is to enable, within certa While the primary requirements seem simple enough to meet -- "just add a VM" -- there are many risks associated with a naive implementation. Therefore subsequent requirements take the form of parameters that constrain implementations in order to mitigate risks, including: - 3. 
Secure: the smart contract system should be secure against benign or malicious smart contract code as well as contract-code input that could imperil system availability, integrity, or confidentiality (in the few cases where secret data exists). In particular at the level of this CAP, the design should guard against: + 3. Secure: Soroban should be secure against benign or malicious smart contract code as well as contract-code input that could imperil system availability, integrity, or confidentiality (in the few cases where secret data exists). In particular at the level of this CAP, the design should guard against: - The risk of resource exhaustion, leading to denial of service by validators. - The risk of VM escape, leading to arbitrary Byzantine failures on validators, including data corruption or unauthorized transactions. - The risk of side channels, allowing VM code to extract validator private keys or other secret data on validators. - The risk of unintended contract behaviour due to invocation with malicious input data. - The risk of unintended contract behaviour due to calls to or from malicious contracts. - 4. Well-defined: the smart contract system should not compromise the network's bit-precise consensus or historical replay functions, and should have a well-defined and unambiguous semantics for any code or data added by users. Where possible this should be maintained by reference to existing, well-defined standards. In particular at the level of this CAP, the design should guard against: + 4. Well-defined: Soroban should not compromise the network's bit-precise consensus or historical replay functions, and should have a well-defined and unambiguous semantics for any code or data added by users. Where possible this should be maintained by reference to existing, well-defined standards. In particular at the level of this CAP, the design should guard against: - The risk of underspecified or nondeterministic VM code. 
- The risk of underspecified or nondeterministic datatypes. - 5. Performant: the smart contract system should not compromise the performance of the network, and should perform competitively with other smart contract systems. Users should not be subject to a significant performance penalty for using smart contracts instead of built-in transactions. In particular at the level of this CAP, the design should guard against: + 5. Performant: Soroban should not compromise the performance of the network, and should perform competitively with other smart contract systems. Users should not be subject to a significant performance penalty for using smart contracts instead of built-in transactions. In particular at the level of this CAP, the design should guard against: - The risk of needing to load, compile, instantiate or run a large amount of VM code per transaction. Contracts should be small. - The risk of contending on shared mutable data that may defeat parallel execution of transactions. Contracts should be isolated. - The risk of requiring smart contract developers to do extensive optimization to achieve acceptable performance. - 6. Interoperable: the smart contract system will necessarily introduce _some_ new user-defined semantics which are by definition unknown to _some_ users and 3rd parties. But beyond such _necessary_ risks, the smart contract system should avoid introducing _unnecessary_ hazards to interoperability, especially through choice of data encoding for input, output and persistent state. In particular at the level of this CAP, the design should guard against: + 6. Interoperable: Soroban will necessarily introduce _some_ new user-defined semantics which are by definition unknown to _some_ users and 3rd parties. But beyond such _necessary_ risks, Soroban should avoid introducing _unnecessary_ hazards to interoperability, especially through choice of data encoding for input, output and persistent state. 
In particular at the level of this CAP, the design should guard against: - The risk of being unable to share data between different contracts, or different versions of the same contract. - The risk of being forced to write contracts in, or invoke contracts from, a single programming language. - The risk of having no tools or only immature tools for working with any programming language targeting the VM. - The risk of being unable to passively observe contract state for testing, debugging, diagnosis or monitoring. - The risk of 3rd parties being unable to exchange data with contracts. - 7. Simple: the smart contract system should be as simple as possible while achieving other requirements. It should not require excessive innovation or expensive engineering by either developers or users of stellar-core. Smart contracts are late in coming to the Stellar Network, there is plenty of prior art to draw from, and there is a limited window of time to complete the work. At the level of this CAP, the design should guard against: + 7. Simple: Soroban should be as simple as possible while achieving other requirements. It should not require excessive innovation or expensive engineering by either developers or users of stellar-core. Smart contracts are late in coming to the Stellar Network, there is plenty of prior art to draw from, and there is a limited window of time to complete the work. At the level of this CAP, the design should guard against: - The risk of designing or implementing a novel VM, programming language, client library, or serialization format. - The risk of selecting an existing platform that is incompatible with or causes major changes to stellar-core. - The risk of delivering a system that is too challenging to learn for users or 3rd parties. 
@@ -91,7 +91,7 @@ This CAP specifies aspects of two separate but related contexts: - The **host** context: this consists of portions of the existing C++ code making up stellar-core that can be accessed by smart contracts, as well as some new C++ and Rust code implied by this CAP. New C++ and Rust code includes the implementation of a WebAssembly (WASM) virtual machine, a set of host objects, and a host environment that contains and manages the lifecycle and interaction of the host objects and virtual machines. The host environment, like the rest of stellar-core, is compiled to native code and runs with full access to its enclosing operating system environment, the ledger, the network, etc. The term "host environment" here corresponds to the term with that name in the WebAssembly specification. - - The **guest** context: this consists of WASM code _interpreted by_ a WASM virtual machine embedded in the host environment. Guest code may originate in any programming language able to target WASM, and will be provided by means unspecified in this CAP. Guest code has very limited access to its enclosing host environment: it can only consume CPU and memory resources to the extent that the host environment permits, and it can only call host functions that the host environment explicitly provides access to. The purpose of the guest context is to act as a so-called "sandbox" to attenuate potential harms caused by erroneous or malicious guest code, while allowing "just enough" programmability to satisfy the needs of users. + - The **guest** context: this consists of WASM code _executed by_ a WASM virtual machine embedded in the host environment. Guest code may originate in any programming language able to target WASM, and will be provided by means unspecified in this CAP. 
Guest code has very limited access to its enclosing host environment: it can only consume CPU and memory resources to the extent that the host environment permits, and it can only call host functions that the host environment explicitly provides access to. The purpose of the guest context is to act as a so-called "sandbox" to attenuate potential harms caused by erroneous or malicious guest code, while allowing "just enough" programmability to satisfy the needs of users. ### Components @@ -112,16 +112,17 @@ A new structure called a **host environment** is added to the transaction-proces - Any host objects that guest code in a WASM VM can refer to. - Any resource-accounting mechanisms for guest code. - Any host functions that guest code in a WASM VM can import. + - A set of in-memory XDR values called "storage", representing a portion of the ledger. #### Interface -The **interface** between the host environment and guest code is very narrow and is defined by the WASM specification of embedding. A summary of some relevant aspects is repeated here: +The **interface** between the host environment and guest code is very narrow and is defined as a subset of the WASM specification of "embedding". A summary of some relevant aspects is repeated here: - Guest memory ("WASM linear memory") is separated from host memory. The host may have a mechanism to access guest memory, but **the guest has no mechanism to access host memory**. - - There are **exactly 4 types of data values** shared between guest and host: i32, i64, f32, and f64. These are 32 and 64-bit 2s complement integers (with undefined "signedness") and 32 and 64-bit IEEE754 binary floating point values. + - WASM itself supports only 4 types of data value: `i32`, `i64`, `f32`, and `f64`. To further simplify the interface, we restrict it to support **exactly one type of data value**: `i64`. Everything Soroban passes back and forth between guest and host is encoded in one or more `i64` values. 
The bits comprising such an `i64` may be interpreted in one of 3 ways depending on context: as a signed 64-bit 2s complement integer, as an unsigned 64-bit 2s complement integer, or as a polymorphic **value** type as described below in the "Data Model" section. - - Guest code modules carry a list of **exported** functions (that the guest provides and the host can call) and a list of **imported** functions (that the host provides and the guest can call). Both imported and exported functions can only pass a sequence of parameters of the 4 shared data types and return a single value of the 4 shared data types, or a trap. + - Guest code modules carry a list of **exported** functions (that the guest provides and the host can call) and a list of **imported** "host functions" (that the host provides and the guest can call). Both imported and exported functions can only pass a sequence of `i64` parameters and return a single `i64` value, or a trap. The set of host functions available for import is detailed in [CAP-0046-03 - Smart Contract Host Functions](./cap-0046-03.md). - Various error conditions may result in a guest **trap** condition, which is a terminal state for the WASM VM running the guest code: no further VM execution can occur after it traps. A trap may be generated by guest code due to an execution error, or may be generated by a host function called from guest code. Therefore any call from guest to host or host to guest may produce a trap result rather than a value. @@ -131,28 +132,34 @@ A host environment has its own **lifecycle**: it is created before any of the ho When a host environment is created, it contains no host objects and no VMs. -Adding a WASM VM to a host environment involves passing WASM code through the 4 lifecycle phases in the WASM specification. If any phase fails, no further phases will be performed on the failed WASM VM. 
+Adding a WASM VM to a host environment involves passing WASM code through the 4 lifecycle phases in the WASM specification: decoding, validation, instantiation and invocation. If any phase fails, no further phases will be performed on the failed WASM VM. Multiple WASM VMs can coexist in a single host environment. The intention is that one host environment and one WASM VM will be created for an "outermost" invocation of a smart contract, and that "inner" contracts can be invoked by guest code calling a host function that constructs an additional VM and invokes a guest function in that new VM, within the same shared host environment. The specific mechanism of calling between contracts is not specified in this CAP. Multiple WASM VMs in the same host environment can refer to the same host objects: this is the mechanism for passing (immutable) information between different smart contracts. +#### Storage + +The host environment's **storage** is initialized with some set of XDR objects loaded from the ledger. The set of XDR objects to load is statically declared by the transaction that causes instantiation of the host environment. After execution, when a host environment is being finalized, the modified portion of the host's storage is written back to the ledger. Between initialization and finalization, storage exists only in the host environment's memory. For more details on the semantics of storage see [CAP 0046-05 Smart Contract Data](./cap-0046-05.md). + #### Limits -TBD. Implementation-defined **limits** will be specified here before finalization of the CAP. +The host maintains a per-transaction budget of CPU and memory resources, and as resources are consumed both by host functions and by the WASM VM execution steps, the budget is reduced until it is exhausted. If the budget is exhausted before transaction completion, the host will trap with an error. 
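The budget behaviour described in the Limits text can be pictured with a small sketch. The following Rust fragment is illustrative only: the `Budget` type, its fields, and the `charge` method are hypothetical names rather than the Soroban host's API, and the amounts passed to `charge` stand in for fixed model costs, not measured time or memory.

```rust
/// Error returned when a charge would exceed the remaining budget.
/// (Hypothetical type; in the CAP's terms this corresponds to a trap.)
#[derive(Debug, PartialEq)]
pub struct BudgetExceeded;

/// Hypothetical per-transaction budget of CPU and memory "model" units.
pub struct Budget {
    cpu_remaining: u64,
    mem_remaining: u64,
}

impl Budget {
    /// Start a transaction with fixed limits for both resources.
    pub fn new(cpu_limit: u64, mem_limit: u64) -> Self {
        Budget { cpu_remaining: cpu_limit, mem_remaining: mem_limit }
    }

    /// Deduct a fixed model cost. The result depends only on the inputs,
    /// so every node charging the same costs fails at the same step.
    pub fn charge(&mut self, cpu: u64, mem: u64) -> Result<(), BudgetExceeded> {
        // Compute both results first so a failed charge leaves the budget unchanged.
        let cpu_left = self.cpu_remaining.checked_sub(cpu).ok_or(BudgetExceeded)?;
        let mem_left = self.mem_remaining.checked_sub(mem).ok_or(BudgetExceeded)?;
        self.cpu_remaining = cpu_left;
        self.mem_remaining = mem_left;
        Ok(())
    }

    /// Remaining (cpu, mem) units, for inspection.
    pub fn remaining(&self) -> (u64, u64) {
        (self.cpu_remaining, self.mem_remaining)
    }
}
```

In this sketch a host function would call `charge` before doing its work and propagate `BudgetExceeded` upward as a terminal error; how the real host structures these calls is not specified here.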
+ +An important aspect of resource limiting is that it is performed against a _deterministic model_ of the computational budget -- with "model" costs incrementally deducted from the budget model by explicit calls placed throughout the host function and VM code -- rather than by measuring real computational resources (time or memory) consumed during execution. This is necessary to maintain deterministic execution: any resource exhaustion that might occur must occur exactly the same way, at exactly the same instant, on every node in the Stellar network processing a Soroban transaction. -Additional implementation-defined limits will be specified to restrict the consumption of host resources by guest code. In particular, a step-counter or "gas limit" will be imposed on the number of instructions executed by guest code. Additionally any computation, memory or IO resources consumed by host functions called by guest code will be accounted-for. Any guest code that exceeds limits will terminate with an error. +The detailed structure and logic for the budget is given in [CAP-0046-10 - Smart Contract Budget Metering](./cap-0046-10.md) #### Determinism Both guest code and any part of the host environment controlled by guest code must execute deterministically in response to inputs, and must be sufficiently well-specified that replaying historical guest code in an upgraded host environment (i.e. a new version of stellar-core) will produce observably-identical results. This includes the result of observable resource exhaustion within host-controlled CPU or memory limits, which implies the need for careful resource accounting on all guest-controlled actions. The WASM spec has [carefully limited nondeterminism](https://github.com/WebAssembly/design/blob/main/Nondeterminism.md) to a small set of cases, which we consider here: - - New features: no WASM features beyond the 1.0 spec are supported by the smart contract system. - - Threads: not supported by the smart contract system. 
+ - New features: only minor, fully deterministic WASM features beyond the 1.0 spec are supported by Soroban. Specifically, these are the `sign-ext` and `mutable-globals` extensions, which are commonly included as target features in high-level language compilers (e.g. both Rust and C/C++ compilers). + - Threads: not supported by Soroban. - NaN-related behaviour for floating point: all floating point code is prohibited. - SIMD-related behaviour: all SIMD extensions are prohibited. - - Environment-resource limit exhaustion: will be specified above. + - Environment-resource limit exhaustion: enforced through a deterministic budget model as discussed above. ### Data Model @@ -164,7 +171,9 @@ This CAP defines a **data model** shared between guest and host environments. It #### Immutability -Values and Objects are both **immutable**: they cannot be changed once created. Any operation on a host object that implies a modification of the object's state will allocate a new object with the modified state, and return a value that refers to the new object. Objects must therefore be relatively lightweight, and reuse shared substructures where possible. +Host objects are **immutable**: they cannot be changed once created. Any operation on a host object that implies a modification of the object's state will allocate a new object (with a new handle) containing the modified state, and return a value that refers to the new object by its new handle. Objects must therefore be relatively small. Objects are _not_ necessarily unique; two objects may be equal (in the sense of containing the same data) but have different handles. + +Values may also be considered "immutable" in some sense, but since they are typically machine primitives and any two equal values are indistinguishable, mutability or immutability is not a particularly meaningful concept for values. 
#### Forms @@ -177,101 +186,176 @@ The rationale for the two separate forms is given below, in the rationale sectio ### XDR changes -See the new XDR files in the Soroban overview CAP. +See the new XDR files in [CAP-0046 - Soroban overview](./cap-0046.md) for a complete listing. + +One XDR union type, and its variants, are worth discussing in this CAP: SCVal. + +#### SCVal + +`SCVal` is a new XDR type. Its name is short for **smart contract ("SC") value**. It is a _general, polymorphic type_ in the sense that it is a union with many possible cases: numbers, strings, booleans, maps, vectors, error codes, and several special cases. It exists because many subsystems of the smart contract system, as well as many smart contracts themselves, must often act on values of interest to contracts without knowing their specific types ahead of time. + +For example, the smart contract transaction invocation path must pass user-provided values to a contract and return values from a contract, and must do so generically without knowledge of the types of those values, so it accepts and returns `SCVal`s. Similarly the smart contract storage system allows loading and storing `SCVal`s in the ledger. And within a contract's own code, often some logic wishes to deal with values without knowing their precise type, such as forwarding values from one contract to another or extracting them from containers. + +`SCVal` is keyed by the enum `SCValType` which has 22 variants. They are described in comments in `Stellar-contract.x`. #### Host value type -The **host value type** is a 64-bit integer carrying a bit-packed disjoint union of several cases: +The **host value type** -- in the Rust host and SDK code this is simply called `Val` -- is a 64-bit integer carrying a bit-packed disjoint union of several cases, each identified by a different `Tag` value. 
- - The least-significant bit differentiates between two _primary_ cases: - - If it is 0, the remaining 63 bits encode a **positive signed 64-bit integer**. - - If it is 1, the remaining 63 bits encode a low 3-bit **tag** and a high 60-bit **body**. - - The 8 tag values define an interpretation of the body, from least-significant to most-significant bits: - - Tag 0: a **32-bit unsigned integer** followed by 30 zero bits. - - Tag 1: a **32-bit signed integer** followed by 30 zero bits. - - Tag 2: a **static** set of 60-bit values, of which the first 3 are **void** (0), **true** (1) and **false** (2). - - Tag 3: an **object reference** given by a 28-bit type code followed by a 32-bit handle. - - Tag 4: a **symbol** having 10 or less 6-bit character codes drawn from the character repertoire `[_0-9A-Za-z]`, with `_` assigned code 1 and trailing positions in the symbol filled with a zero code, and code positions starting at the least significant 6 bits of the body. - - Tag 5: a **bitset** consisting of 60 1-bit flags. - - Tag 6: a **status** value consisting of a 28-bit type code followed by a 32-bit status code. - - Tag 7: reserved for future use. +##### Bit-packed representation -Note that the tag numbers in the host value representation are _not_ identical to the `SCValType` enumeration values used in the `SCVal` union. For example `SCV_OBJECT` is `4` whereas the host object tag value is `3`. The difference arises from the fact that the host value type has a 2-level tagging scheme -- a 1-bit level followed by a 3-bit level -- whereas `SCValType` is has a single 32-bit level of tagging. +The low 8 bits of a `Val` are referred to as the **tag** and the remaining high 56 bits are referred to as the **body**. The tag's value determines the interpretation of the body. In some cases the body is itself further subdivided into 24 low bits, called the body's **minor component**, and 32 high bits, called the body's **major component**. 
+ +In other words, a value schematically looks like one of the following two cases: + +``` +bit 64 56 48 40 32 24 16 8 0 + +-------+-------+-------+-------+-------+-------+-------+-------+ + | body | tag | + +-------+-------+-------+-------+-------+-------+-------+-------+ + + +bit 64 56 48 40 32 24 16 8 0 + +-------+-------+-------+-------+-------+-------+-------+-------+ + | major | minor | tag | + +-------+-------+-------+-------+-------+-------+-------+-------+ +``` + +When accessing the body, the bit pattern may be considered as either a signed or unsigned 64-bit value. If signed, the body is extracted by a signed (arithmetic) right shift, properly sign-extending from 56 to 64 bits any negative values stored in the body. Similarly the major component may be treated as a signed or unsigned 32-bit integer. The minor component is only ever treated as an unsigned 32-bit integer, and is zero-extended from 24 to 32 bits on access. + +##### Tag values + +The different cases of the XDR value type `SCVal` are differentiated by the XDR enum `SCValType`, which is subsequently encoded as `Tag`s in a `Val`, though the mapping is 1:N rather than 1:1. Specifically, for each 1 `SCVal` case (i.e. `SCValType` code) at the XDR level, there may be N (usually 1 or 2) different _refinements_ of that type as a specialized `Tag` case in the host value type, usually to enable a more compact representation when small special cases of `SCVal` are projected into host values. + +`Tag` values are organized in two contiguous blocks: + + - A low-valued block (initially between values 0 and 15 inclusive) that covers "small" `Val`s, where the entire semantic content of the `Val` is contained in its body. + - A high-valued block (initially between values 64 and 77 inclusive) that covers "object handle" values, where the body of the `Val` just carries an object handle in its "major" component. 
+ +The two blocks are kept separate to enable an efficient single-comparison `Tag` test for all object handle values. The split between blocks happens at tag value 64 rather than 128 (as might be expected given the 8 bit range of `Tag`) so that all initially assigned tags are less than 127, which is the maximum size of a single WASM ULEB128 code unit (another minor space optimization). We anticipate the system will grow to support some additional tags in the future, but believe the available tag space will be sufficient to accommodate such growth. + +The specific `Tag` values are: + + - `Tag::False = 0`, a refinement of the `SCVal` case for `SCV_BOOL` encoding just boolean false. The body is zero. + - `Tag::True = 1`, a refinement of the `SCVal` case for `SCV_BOOL` encoding just boolean true. The body is zero. + - `Tag::Void = 2`, corresponding to the `SCVal` case for `SCV_VOID`. The body is zero. + - `Tag::Error = 3`, corresponding to the `SCVal` case for `SCV_ERROR`. The body takes the major/minor form: + - The minor component is an "error type", one of the values of the XDR enumeration `SCErrorType`. + - The major component is an "error code": + - If the "error type" is `SCE_CONTRACT`, the major component is the `uint32` error code in the `SCE_CONTRACT` case of `SCError`, a contract-defined error code with no specific meaning to the runtime. + - Otherwise the major component is the `SCErrorCode` value of the corresponding `SCE_*` case of `SCError`. + - `Tag::U32Val = 4`, corresponding to the `SCVal` case for `SCV_U32`. The major component carries an unsigned 32-bit integer. + - `Tag::I32Val = 5`, corresponding to the `SCVal` case for `SCV_I32`. The major component carries a signed 32-bit integer. + - `Tag::U64Small = 6`, a refinement of the `SCVal` case for `SCV_U64` for unsigned 64-bit integer values that are small enough to fit in the 56 bits of the `Val`'s body without data loss. 
Specifically those values in the range from `0` to `0x00ff_ffff_ffff_ffff` inclusive. + - `Tag::I64Small = 7`, a refinement of the `SCVal` case for `SCV_I64` for signed 64-bit integer values that are small enough to fit in the 56 bits of the `Val`'s body without data loss. Specifically those `int64` values in the range from `-36_028_797_018_963_968` to `36_028_797_018_963_967` inclusive. + - `Tag::TimepointSmall = 8`, the same as `U64Small` but for the `SCVal` case for `SCV_TIMEPOINT`. + - `Tag::DurationSmall = 9`, the same as `U64Small` but for the `SCVal` case for `SCV_DURATION`. + - `Tag::U128Small = 10`, the same as `U64Small` but for the `SCVal` case for `SCV_U128`. + - `Tag::I128Small = 11`, the same as `I64Small` but for the `SCVal` case for `SCV_I128`. + - `Tag::U256Small = 12`, the same as `U64Small` but for the `SCVal` case for `SCV_U256`. + - `Tag::I256Small = 13`, the same as `I64Small` but for the `SCVal` case for `SCV_I256`. + - `Tag::SymbolSmall = 14`, a refinement of the `SCVal` case for `SCV_SYMBOL` for small symbols up to 9 characters long. The body of the `Val` contains between 0 and 9 characters, with each character encoded as a 6-bit, 1-based code that indexes into the 63-character repertoire allowed by the general `SCV_SYMBOL` type: `[_0-9A-Za-z]`. That is, the character `_` is coded by the six bits `0b00_0001`, the character `0` is coded by the six bits `0b00_0010`, and so on, with the final allowed character `z` coded by the six bits `0b11_1111`. Then these 6-bit codes are packed into the 56-bit body such that the lowest 6 bits of the body always code for the last character in the symbol, and if the symbol is shorter than 9 characters then the body's _high bits_ are padded with all-zero 6-bit codes (this representation optimizes for encoding in WASM's ULEB128 format). 
+ - `Tag::LedgerKeyContractInstance = 15`, a refinement of the `SCVal` case for `SCV_LEDGER_KEY_CONTRACT_INSTANCE`, a special value reserved for use as a key identifying contract instances in the storage system. The body is zero. + - `Tag::U64Object = 64`, for object-handle `Val`s referring to the `SCVal` case for `SCV_U64`, typically only used when the `uint64` is larger than 56 bits and so cannot fit in a `U64Small`, though small integers stored in `U64Object` are legal. The body's major component is a 32-bit object handle, referring to a host object. The minor component is zero. + - `Tag::I64Object = 65`, the same as `U64Object` but for the `SCVal` case for `SCV_I64`. + - `Tag::TimepointObject = 66`, the same as `U64Object` but for the `SCVal` case for `SCV_TIMEPOINT`. + - `Tag::DurationObject = 67`, the same as `U64Object` but for the `SCVal` case for `SCV_DURATION`. + - `Tag::U128Object = 68`, the same as `U64Object` but for the `SCVal` case for `SCV_U128`. + - `Tag::I128Object = 69`, the same as `U64Object` but for the `SCVal` case for `SCV_I128`. + - `Tag::U256Object = 70`, the same as `U64Object` but for the `SCVal` case for `SCV_U256`. + - `Tag::I256Object = 71`, the same as `U64Object` but for the `SCVal` case for `SCV_I256`. + - `Tag::BytesObject = 72`, for object-handle `Val`s referring to the `SCVal` case for `SCV_BYTES`. + - `Tag::StringObject = 73`, for object-handle `Val`s referring to the `SCVal` case for `SCV_STRING`. + - `Tag::SymbolObject = 74`, for object-handle `Val`s referring to the `SCVal` case for `SCV_SYMBOL`, typically only used when the symbol is longer than 9 characters, so cannot fit in a `SymbolSmall`. + - `Tag::VecObject = 75`, for object-handle `Val`s referring to the `SCVal` case for `SCV_VEC`. + - `Tag::MapObject = 76`, for object-handle `Val`s referring to the `SCVal` case for `SCV_MAP`. + - `Tag::AddressObject = 77`, for object-handle `Val`s referring to the `SCVal` case for `SCV_ADDRESS`. 
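To make the tag and body layout listed above concrete, here is a minimal Rust sketch. It models a `Val` as a bare `u64`; every function and constant name is hypothetical and chosen for illustration, not taken from the host or SDK code. It covers only three of the small cases (`U32Val`, `I64Small`, `SymbolSmall`) plus the object-tag test, and `is_object` assumes the tag has already been checked to be one of the assigned values.

```rust
// Illustrative sketch only: names are hypothetical, not the soroban-env API.
pub const TAG_BITS: u32 = 8;
pub const TAG_U32_VAL: u64 = 4;
pub const TAG_I64_SMALL: u64 = 7;
pub const TAG_SYMBOL_SMALL: u64 = 14;
pub const FIRST_OBJECT_TAG: u64 = 64;

/// Low 8 bits: the tag.
pub fn tag(val: u64) -> u64 { val & 0xff }

/// High 56 bits as a signed body; the arithmetic shift sign-extends.
pub fn body_signed(val: u64) -> i64 { (val as i64) >> TAG_BITS }

/// High 32 bits: the major component.
pub fn major(val: u64) -> u32 { (val >> 32) as u32 }

/// Bits 8..32: the minor component, zero-extended to 32 bits.
pub fn minor(val: u64) -> u32 { ((val >> TAG_BITS) & 0x00ff_ffff) as u32 }

/// Single comparison against the start of the object-handle block.
pub fn is_object(val: u64) -> bool { tag(val) >= FIRST_OBJECT_TAG }

/// `U32Val`: the integer goes in the major component, minor is zero.
pub fn pack_u32(x: u32) -> u64 { ((x as u64) << 32) | TAG_U32_VAL }

/// `I64Small`: only values that fit in 56 signed bits are accepted.
pub fn pack_i64_small(x: i64) -> Option<u64> {
    let min = -(1i64 << 55);
    let max = (1i64 << 55) - 1;
    if x < min || x > max {
        return None;
    }
    Some(((x << TAG_BITS) as u64) | TAG_I64_SMALL)
}

/// 1-based 6-bit code: `_` = 1, `0`-`9` = 2..=11, `A`-`Z` = 12..=37, `a`-`z` = 38..=63.
pub fn symbol_code(c: char) -> Option<u64> {
    match c {
        '_' => Some(1),
        '0'..='9' => Some(2 + (c as u64 - '0' as u64)),
        'A'..='Z' => Some(12 + (c as u64 - 'A' as u64)),
        'a'..='z' => Some(38 + (c as u64 - 'a' as u64)),
        _ => None,
    }
}

/// `SymbolSmall`: the last character ends up in the lowest 6 bits of the
/// body, and unused high bits stay zero.
pub fn pack_symbol_small(s: &str) -> Option<u64> {
    if s.chars().count() > 9 {
        return None;
    }
    let mut body = 0u64;
    for c in s.chars() {
        body = (body << 6) | symbol_code(c)?;
    }
    Some((body << TAG_BITS) | TAG_SYMBOL_SMALL)
}
```

Values rejected by `pack_i64_small` or `pack_symbol_small` are the ones that, per the list above, would instead be stored as host objects and referred to through the corresponding `*Object` tag.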
+ +The Rust code defining the `Tag` datatype includes some additional symbolic names for the boundaries of the assigned tag codes, as well as a sentinel for unassigned tags, but these are not part of the interface specified by this CAP. All tag values not described above are reserved for future use. #### Host object type(s) There are many different **host object types**, and we refer to the disjoint union of all possible host object types as **the host object type**. This may be implemented in terms of a variant type, an object hierarchy, or any other similar mechanism in the host. -Every host object is held in host memory and **cannot be accessed directly from guest code**. Host objects can be _referred to_ by host values in either host or guest code: specifically those values with tag 3 (object reference) refer to a host object by type code and handle. +Every host object is held in host memory and **cannot be accessed directly from guest code**. Host objects can be _referred to_ by host values in either host or guest code: specifically those values with tags between `64` and `77` inclusive refer to host objects by handle. -**Host object handles** are assigned sequentially from 1, as host objects are allocated during the lifecycle of a host execution context. Host object handle 0 is reserved as a sentinel value that always denotes an invalid object, on which no host functions are defined. All host object types share a single numerical range of handles. In other words: the type codes held in object references _reflect_ type differences between host objects, to allow guests to switch on host object types without calling host functions to query them, but the object type codes do not subdivide the numeric range of object handles. +**Host object handles** are integers that identify host objects. They come in two forms: **relative** handles and **absolute** handles. 
Relative handles are, as their name suggests, only meaningful _relative_ to a specific WASM VM: they are indexes into an indirection table attached to each WASM VM that maps relative handles to absolute handles. Absolute handles identify host objects within the host independently of any WASM VM. When guest code running in a WASM VM has a value of some object-handle type, it is always a _relative_ handle. When guest code calls the host, any relative handle being passed is translated to an absolute handle, and when an absolute handle is returned from the host to the guest it is translated from an absolute to a relative handle. This way guests never see absolute handles, and cannot access any host objects that they have not explicitly been passed references to (eg. as invocation arguments or return values from host functions). -There are 2^28 (268,435,456) possible **host object type codes**, of which only the first 6 are defined in this CAP: +If a host object is accessed through an invalid handle -- a number that does not identify an object -- the access fails with an error. - - Object type 0: a **box** which contains a single host value. - - Object type 1: a **vector** which contains a sequence of host values. - - Object type 2: a **map** which is an ordered association from host values to host values. - - Object type 3: an **unsigned 64-bit integer**. - - Object type 4: an **signed 64-bit integer**. - - Object type 5: a **binary** object containing unspecified bytes. +If a host object is accessed through a value with a tag that does not match the actual type of the underlying host object, the access fails with an error. While not strictly necessary -- it would be possible to simply ignore the tag -- this helps catch coding errors. Similarly if a host function expects a host object handle argument with a specific tag, and is passed a value with a different tag, it is rejected with an error even if the object handle number is valid. 
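The relative/absolute handle indirection described above might look roughly like the following. This is only a sketch under stated assumptions: the `RelativeHandleTable` type and its methods are hypothetical names, handles are modelled as plain `u32` indexes, and the real host may allocate or reuse table slots differently.

```rust
/// Hypothetical per-VM indirection table: the index is a relative handle,
/// the stored value is the absolute handle it maps to.
struct RelativeHandleTable {
    absolute: Vec<u32>,
}

impl RelativeHandleTable {
    fn new() -> Self {
        Self { absolute: Vec::new() }
    }

    /// Host -> guest direction: record an absolute handle and return the
    /// relative handle the guest will see. (This sketch always appends a
    /// new slot; that is an assumption, not specified behaviour.)
    fn absolute_to_relative(&mut self, abs: u32) -> u32 {
        self.absolute.push(abs);
        (self.absolute.len() - 1) as u32
    }

    /// Guest -> host direction: translate a relative handle back to an
    /// absolute one, failing if the guest was never given that handle.
    fn relative_to_absolute(&self, rel: u32) -> Result<u32, String> {
        self.absolute
            .get(rel as usize)
            .copied()
            .ok_or_else(|| format!("invalid relative handle {}", rel))
    }
}
```

Because a guest can only name slots in its own table, a relative handle that was never issued to that VM fails translation instead of reaching some other host object.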
-Note that unlike value tags, the host object type codes _are_ the same numbers as the `SCObjectType` codes in the XDR form. That is, `SCO_VEC` has value `1` which is the same as the host object type code for vector. Maintaining common numbering limits `SCObjectType` to 2^28 possible values as well. - -This CAP defines a basic comparison operation for these types, as well as validity and conversion operations for the XDR, but no other operations. An expanded repertoire of host object types and functions that operate on them will be presented in a later CAP. +The specific operations that can be performed on each host object are defined by host functions, described in [CAP-0046-03 - Smart Contract Host Functions](./cap-0046-03.md). #### Comparison Values and objects in the data model have a total order. When comparing two values A and B: - - If A is a positive int64 and B is not, A is less than B. - - If A and B are both positive int64 values, they are ordered by the normal int64 order. - - If A and B are both tagged and if A has a lesser tag than B, A is less than B. - - If A and B are both equally tagged, then: - - If they have tag 0, they are ordered by the normal uint32 order on their low 32 bits. - - If they have tag 1, they are ordered by the normal int32 order on their low 32 bits. - - If they have tag 2, 5 or 6 or 7 they are ordered by the normal uint64 order on the zero-extension of their low 60 bits. - - If they have tag 4 they are ordered by the lexicographical order of their Unicode string value. - - If they have tag 3 they are ordered by calling `obj_cmp(A, B)` which performs deep object comparison. - -Deep object comparison can be accessed by either guest or host: it is provided to guests as a host function via the host environment interface. It performs a recursive structural comparison of objects and values embedded in objects using the following rules: - - - If A and B have different object types, they are ordered by object type code. 
- - If A and B are boxes, their values are ordered by the value rules above. - - If A and B are vectors, they are ordered by lexicographic extension of the value order - - If A and B are maps, they are ordered lexicographically as ordered vectors of (key, value) pairs - - If A and B are int64 or uint64, they are ordered using the normal order for those types - - If A and B are binary, they are ordered using the lexicograhical order of their respective bytes + - If both values have an equal bit-pattern, their order is equal. + - If _either_ value is an object-handle type, they are compared through object comparison (via the host function `obj_cmp`) as described below. + - Otherwise A and B are both small-value types: + - If A's `Tag` differs from B's `Tag`, they are ordered by numeric `Tag` value (which, for small values, match the order of the corresponding XDR `SCValType`s). + - Otherwise A and B have the same `Tag` value: + - If A and B have common tag `Tag::False`, `Tag::True`, `Tag::Void`, or `Tag::LedgerKeyContractInstance`, A and B are equal. + - If A and B have common tag `Tag::Error`, A and B are ordered first by their minor components (the "error type"), then by their major components (the "error code"), both treated as unsigned 32-bit integers. + - If A and B have common tag `Tag::U32Val`, A and B are ordered by their major components, treated as unsigned 32-bit integers. + - If A and B have common tag `Tag::I32Val`, A and B are ordered by their major components, treated as signed 32-bit integers. + - If A and B have common tag `Tag::U64Small`, `Tag::U128Small` or `Tag::U256Small`, A and B are ordered by their bodies, treated as unsigned 64-bit integers. + - If A and B have common tag `Tag::I64Small`, `Tag::I128Small` or `Tag::I256Small`, A and B are ordered by their bodies, treated as signed 64-bit integers. 
+ +Object comparison can be accessed by either guest or host: it is provided to guests as a host function `obj_cmp` via the host environment interface. It performs a recursive structural comparison of objects, as well as values embedded in objects, using the following rules: + + - If A and B have the same `Tag` value, they are directly compared as objects: + - If A and B have common tag `Tag::VecObject`, they are ordered by lexicographic extension of the value order. + - If A and B have common tag `Tag::MapObject`, they are ordered lexicographically as ordered vectors of (key, value) pairs. + - If A and B have common tag `Tag::U64Object`, `Tag::I64Object`, `Tag::U128Object`, `Tag::I128Object`, `Tag::U256Object` or `Tag::I256Object`, they are ordered using the numerical order for those types. + - If A and B have common tag `Tag::BytesObject`, `Tag::StringObject`, `Tag::SymbolObject`, or `Tag::AddressObject`, they are ordered (recursively) in the natural order of their corresponding XDR representations: lexicographically by structure field order, sequence order, union discriminant and structure field numerical orders. + - Otherwise only one of A or B is an object handle: + - If either has tag `Tag::U64Small` and the other has tag `Tag::U64Object`, both are compared as their underlying unsigned 64-bit integers. + - Similarly when comparing a combination of tags `Tag::I64Small` and `Tag::I64Object`, or `Tag::TimepointSmall` and `Tag::TimepointObject`, or `Tag::DurationSmall` and `Tag::DurationObject`, or `Tag::U128Small` and `Tag::U128Object`, or `Tag::I128Small` and `Tag::I128Object`, or `Tag::U256Small` and `Tag::U256Object`, or `Tag::I256Small` and `Tag::I256Object`, a small-value case and large-value case of the same underlying numeric type are compared in terms of that underlying numeric type.
+ - Similarly if either has tag `Tag::SymbolSmall` and the other has tag `Tag::SymbolObject`, both are compared lexicographically as the underlying sequence of characters in each symbol. + - Otherwise some object type and an unrelated non-object type are being compared, so their actual values are ignored and they are compared by the numerical value of the `SCValType` of the un-refined XDR `SCVal` type they represent (i.e. both `Tag::I64Small` and `Tag::I64Object` are projected to their `SCValType` `SCV_I64` for numerical code-comparison with the `SCValType` of the other value). + #### Validity The following additional validity constraints are imposed on the XDR types. Values not conforming to these constraints are rejected during conversion to host form: - - `SCVal.pos_i64` must be >= 0. - - `SCVal.sym` must consist only of the characters `[_0-9A-Za-z]` - - `SCVal.obj` must not be empty (it is optional in the XDR only to enable type-recursion) - - `SCVal.bits` must have its most significant 4 bits set to 0. -#### Conversion - -Conversion from an XDR `SCVal` to a host value is as follows: - - Type cases other than `SCV_OBJECT` are directly encoded into their bit-packed host value form. - - For the `SCV_OBJECT` case, the contained `SCObject` is converted into a host object and placed in the host environment's host object array at the next available position `P`. The resulting host value is object handle `P`. - -Conversion from a host value to an XDR `SCVal` is as follows: - - For the bit-packed primary case 0, and for cases other than tag 3 (object) in primary case 1, each bit-packed representation is copied directly to its corresponding SCVal case. - - For tag 3 in primary case 1, the object handle value is accessed in the host environment's host object array. If the object handle has a value beyond the end of the host object array, the conversion fails with an error. 
Otherwise the result of conversion is an `SCVal` in `SCV_OBJECT` state, with the conversion of the located host object assigned to `SCVal.obj`. + - `SCVal.sym` must consist only of the characters `[_0-9A-Za-z]` and be no longer than `SCSYMBOL_LIMIT` (currently 32 characters). + - `SCVal.map` and `SCVal.vec` must not be empty (they are optional in the XDR only to enable type-recursion) + - `SCVal.map` must be populated by `SCMapEntry` pairs in increasing `key`-order, with no duplicate keys. -Conversion from an XDR SCObject to a host object is as follows: - - Type case `SCO_BOX` forms a host box containing the conversion of the contained SCVal. - - Type case `SCO_VEC` forms a host vector containing the conversion of each contained SCVal in order. - - Type case `SCO_MAP` forms an ordered host map and then for each pair `SCMapEntry`, adds an entry mapping the conversion of the map entry's `key` to the conversion of the map entry's `val`, returning the resulting host map once all `SCMapEntry`s are added. Note that this means that the resulting map will be in comparison order rather than the order `SCMapEntry`s were provided, and any redundant entries for the same key earlier in the array of `SCMapEntry`s will be overwritten by later entries for the same key. - - Type cases `SCO_U64`, `SCO_I64` and `SCO_BINARY` simply move their contained value into a host object with the same content, unaltered. +#### Conversion - Conversion from a host object to an XDR SCObject is as follows: - - A host box object forms an `SCObject` of type `SCO_BOX` with the conversion of its value in `SCObject.box`. - - A host vector object forms an `SCObject` of type `SCO_VEC` with the conversions of its element values in `SCObject.vec`. - - A host map object forms an `SCObject` of type `SCO_MAP` with each mapping entry converted to an `SCMapEntry` and added to the resulting `SCObject.map` field in host value comparison order, from low to high. 
- - A signed or unsigned int64 object, or binary object, is simply moved to its respective `SCObject` case. +Conversion from an XDR `SCVal` to a host value `Val` is as follows: + - The `true` and `false` cases of `SCV_BOOL` are separately encoded as `Val`s with `Tag::True` or `Tag::False`, and zero bodies. + - The `SCV_VOID` and `SCV_LEDGER_KEY_CONTRACT_INSTANCE` cases are encoded as `Val`s with `Tag::Void` and `Tag::LedgerKeyContractInstance`, respectively, and zero bodies. + - The `SCV_ERROR` case is encoded as a `Val` with `Tag::Error`, with the `SCErrorType` stored in the `Val`'s minor component and the major component either storing: + - The `uint32` in the `contractCode` field, if the `SCError` is in case `SCE_CONTRACT` + - Otherwise the numeric value of the `SCErrorCode` in the `code` field of all other `SCE_*` cases. + - Case `SCV_U32` is encoded as a `Val` with `Tag::U32Val`, with the `u32` field stored in its major component. + - Case `SCV_I32` is encoded as a `Val` with `Tag::I32Val`, with the `i32` field stored in its major component. + - Cases `SCV_U64`, `SCV_TIMEPOINT`, `SCV_DURATION`, `SCV_U128`, `SCV_U256` are encoded by first considering whether the underlying numeric value, when considered as an unsigned 64-bit value, fits in 56 bits. If so, it is encoded as a `Val` with `Tag::U64Small`, `Tag::TimepointSmall`, `Tag::DurationSmall`, `Tag::U128Small` or `Tag::U256Small` respectively, with the small unsigned integer value packed into the body. Otherwise they are stored as new host objects and the handle to the object is stored in the major component of a `Val` with `Tag::U64Object`, `Tag::TimepointObject`, `Tag::DurationObject`, `Tag::U128Object` or `Tag::U256Object` respectively.
+ - Similarly cases `SCV_I64`, `SCV_I128`, and `SCV_I256` are encoded either as the 56-bit body of `Val`s with their corresponding small value tags `Tag::I64Small`, `Tag::I128Small` or `Tag::I256Small` or as object handles in the 32-bit major component of `Val`s with their corresponding general object tags `Tag::I64Object`, `Tag::I128Object`, `Tag::I256Object` depending on whether their underlying numeric value, when considered as a signed 64 bit value, can be encoded in 56 bits without data loss. + - Similarly case `SCV_SYMBOL` is bit-packed as 6 bit codes (as described above) in the body of a `Val` with `Tag::SymbolSmall` if the symbol's length is 9 characters or less, otherwise it's stored as a new host object with its handle stored in the major component of a `Val` with `Tag::SymbolObject`. + - Cases `SCV_BYTES`, `SCV_STRING` and `SCV_ADDRESS` are each stored unconditionally as a new host object, with the object handle stored as the major component of a `Val` with `Tag::BytesObject`, `Tag::StringObject` and `Tag::AddressObject` respectively. + - Case `SCV_VEC` unconditionally stores a new host object, with the object handle stored as the major component of a `Val` with `Tag::VecObject`, but only after _recursively_ converting its contained `SCVal`s to `Val`s using the same rules specified here. In other words the host object stores a vector of _converted_ `Val`s, not unconverted `SCVal`s. + - Similarly case `SCV_MAP` unconditionally stores a new host object, with the object handle stored as the major component of a `Val` with `Tag::MapObject`, and only after _recursively_ converting its contained `SCMapEntry`s to _pairs_ of `Val`s using the same rules specified here. In other words the host object stores a vector of pairs of _converted_ `Val`s, not unconverted `SCMapEntry`s or `SCVal`s.
+ - Cases `SCV_LEDGER_KEY_NONCE` and `SCV_CONTRACT_INSTANCE` are reserved for host-managed storage keys, and are only ever represented in their XDR form. They therefore do not have corresponding cases in `Tag`, so attempted conversion to `Val` fails with an error. + + +Conversion from a host value `Val` to an XDR `SCVal` is as follows: + - `Val`s with `Tag::True` or `Tag::False` are encoded as booleans in `SCVal` case `SCV_BOOL`. + - `Val`s with `Tag::Void` and `Tag::LedgerKeyContractInstance` are encoded as the void `SCVal` cases `SCV_VOID` and `SCV_LEDGER_KEY_CONTRACT_INSTANCE`, respectively. + - `Val`s with case `Tag::Error` are encoded as case `SCV_ERROR` with `SCError` cases chosen by the `Val`'s minor component interpreted as an `SCErrorType`: + - In case `SCE_CONTRACT`, the major component becomes the `uint32` field `contractCode` + - In all other `SCE_*` cases, the major component becomes the `SCErrorCode` field `code` + - `Val`s with `Tag::U32Val` are encoded as case `SCV_U32` with the `u32` field taken from the `Val`'s major component interpreted as an unsigned 32-bit integer. + - `Val`s with `Tag::I32Val` are encoded as case `SCV_I32` with the `i32` field taken from the `Val`'s major component interpreted as a signed 32-bit integer. + - `Val`s with `Tag::U64Small`, `Tag::TimepointSmall`, `Tag::DurationSmall`, `Tag::U128Small`, or `Tag::U256Small` are encoded as `SCV_U64`, `SCV_TIMEPOINT`, `SCV_DURATION`, `SCV_U128` and `SCV_U256` with their numeric values taken from the `Val`'s body interpreted as an unsigned 64-bit integer. + - Similarly, `Val`s with `Tag::I64Small`, `Tag::I128Small`, or `Tag::I256Small` are encoded as `SCV_I64`, `SCV_I128` and `SCV_I256` with their numeric values taken from the `Val`'s body interpreted as a signed 64-bit integer. + - `Val`s with `Tag::SymbolSmall` are encoded as `SCV_SYMBOL` with characters extracted from the sequence of characters bit-packed into the body of the `Val`.
+ - `Val`s that encode object handles are dereferenced and the underlying object is converted back to its unique `SCVal` case: `Tag::U64Object` to `SCV_U64`, `Tag::I64Object` to `SCV_I64`, `Tag::TimepointObject` to `SCV_TIMEPOINT`, `Tag::DurationObject` to `SCV_DURATION`, `Tag::U128Object` to `SCV_U128`, `Tag::I128Object` to `SCV_I128`, `Tag::U256Object` to `SCV_U256`, `Tag::I256Object` to `SCV_I256`, `Tag::SymbolObject` to `SCV_SYMBOL`, `Tag::BytesObject` to `SCV_BYTES`, `Tag::StringObject` to `SCV_STRING`, `Tag::VecObject` to `SCV_VEC`, `Tag::MapObject` to `SCV_MAP`, and `Tag::AddressObject` to `SCV_ADDRESS`. As with conversion into `Val`, converting the container types `Tag::VecObject` and `Tag::MapObject` back to `SCVal`s first recursively converts their contained `Val` elements to `SCVal`s, using the same rules described here. - Note that due to the re-ordering and de-duplication that occurs when converting an `SCO_MAP` `SCObject`, it is not the case that "round trip" conversions from XDR to host forms produce identical results. ## Design Rationale ### Rationale for WASM @@ -297,12 +381,12 @@ Relative to requirements listed in this CAP, WASM addresses many of them: - Compatibility: many WASM interpreters are written in C++ and/or Rust, can be embedded easily in stellar-core. - Learnability: WASM is not as familiar as EVM but is relatively widely known and appears easy to learn.
-### Rationale for value / object split +### Rationale for host value / host object split -The split between values (which can traverse the host/guest interface) and objects (which remain on the host side and are managed by host functions) is justified as a response to a number of observations we made when considering existing blockchains: +The split between host value types (`Val`s that can traverse the host/guest interface) and host objects (that remain on the host side, are identified only by handles, and are managed by host functions) is justified as a response to a number of observations we made when considering existing blockchains: - Many systems spend a lot of guest code footprint (time and space) implementing data serialization and deserialization to and from opaque byte arrays. This code suffers from a variety of problems: - - It is often to and from an opaque format, making a contract's data difficult to browse or debug, and making SDKs that invoke contracts need to carry special code to serialize and deserialize data for the contract. + - It is often to and from an opaque, non-standard or contract-specific format, making a contract's data difficult to browse or debug, and making SDKs that invoke contracts need to carry special code to serialize and deserialize data for the contract. - It is often coupled to a specific version or layout of a data structure, such that data cannot easily be migrated between versions of a contract. - It requires that a contract potentially contains extra copies of serialization support code for the formats used by any contracts it calls. - It is often intermixed with argument processing and contract logic, representing a significant class of security problems in contracts.
@@ -310,13 +394,13 @@ The split between values (which can traverse the host/guest interface) and objec - Similarly, when guest code is CPU-intensive it is often performing numerical or cryptographic operations which would be better supported by a common library of efficient (native) host functions. - - As of this writing, WASM defines no mechanism of directly sharing code, which makes it impossible to reuse common guest functions needed by many contracts. Sharing common host functions is comparatively straightforward, and much more so if we define a common data model on which host functions operate. + - As of this writing, WASM defines no standardized, mature, widely-supported mechanism of directly sharing code, which makes it impossible to reuse common guest functions needed by many contracts. Possibly in the future the [WASM component model](https://github.com/WebAssembly/component-model) may present such a mechanism for sharing code between modules, but at present it is still incomplete and not widely implemented. Sharing common host functions is comparatively straightforward, and much more so if we define a common data model on which host functions operate. - The more time is spent in the guest, the more the overall system performance depends directly on the speed of the guest VM's bytecode-dispatch mechanism (a.k.a. the VM's "inner loop"). By contrast, if the guest VM spends most of its time making a sequence of host calls, the bytecode-dispatch speed of the guest VM is less of a concern. This gives us much more flexibility in choice of VM, for example to choose simple, low-latency and comparatively-secure interpreters rather than complex, high-latency and fragile JITs. Some systems mitigate these issues by providing byte-buffers of data to guests in a guaranteed input format, such as JSON. This eliminates some of the interoperability concerns but none of the efficiency concerns: the guest still spends too much time parsing input and building data structures. 
-Ultimately we settled on an approach in which the system will spend _as little time in the guest as possible_, and will furnish the guest with a rich enough repertoire of host objects that it should not need many or any of its own guest-local data structures. We expect that many guests will be able to run without a guest memory allocator at all. +Ultimately we settled on an approach in which the system will spend _as little time in the guest as possible_, and will furnish the guest with a rich enough repertoire of host objects that it should not need many or any of its own guest-local data structures. Our experience suggests that many guests will be able to run without a guest memory allocator at all. There are various costs and benefits to this strategy. We compared in detail to many other blockchains with different approaches before settling on this one. @@ -328,9 +412,9 @@ Costs: - Risks redundant work, guest _may_ choose to ignore host objects. Benefits: - - Much faster execution due to most logic being in C++. + - Much faster execution due to more logic being in natively-compiled host Rust code. - Smaller guest input-parsing attack surfaces to defend. - - Smaller guest data compatibility surfaces to maintain. + - Smaller guest data compatibility surfaces to maintain. - Much smaller guest code, minimizing storage and instantiation costs: - Little or no code to serialize or deserialize data in guest. - Little or no common memory-management or data structure code in guest. @@ -341,7 +425,7 @@ Benefits: - Easier to pass data from one contract to another. - Easier to use same data model from different source languages. -It is especially important to note that the (enlarged) attack and maintenance surfaces on the host are costs borne by stellar-core developers, while the (diminished) attack and maintenance surfaces are benefits that accrue to smart contract developers. We believe this is a desirable balance of costs and benefits. 
+It is especially important to note that the (enlarged) attack and maintenance surfaces on the host are costs borne by Soroban's developers, while the (diminished) attack and maintenance surfaces are benefits that accrue to smart contract developers. We believe this is a desirable balance of costs and benefits, as contract developers are likely to significantly outnumber Soroban developers. ### Rationale for value and object type repertoires @@ -350,21 +434,51 @@ These are chosen based on two criteria: - Reasonably-foreseeable use in a large number of smart contracts. - Widely-available implementations with efficient immutable forms. -In addition, _values_ are constrained by the ability to be packed into a 64-bit tagged disjoint union. Special cases for common small values such as symbols, booleans, 32-bit integers, status codes and small bitsets are provided on the basis of presumed utility in a variety of contexts. +In addition, _values_ are constrained by the ability to be packed into a 64-bit tagged disjoint union. Special cases for common small values such as symbols, booleans, integer types and error codes are provided on the basis of presumed utility in a variety of contexts. + +#### Numeric types + +The value repertoire includes **signed and unsigned integer types** as its sole number types: + - 32 and 64-bit types, as these are standard WASM types and useful for most purposes + - 128-bit types, which are natively supported by Rust (the host and guest language Soroban ships with support for). This type is also large enough to act as a very high precision fixed-point number for currency calculations: 19 decimal digits on either side of the decimal point. As this is larger than the standard 18 decimal places used by default by Ethereum's ERC20 token standard, 128-bit integers are used by Soroban's native contract interface as a common type for expressing quantities. 
+ - 256-bit types, which are useful for two distinct reasons: + - For interoperation with Ethereum or other 256-bit integer blockchains + - To store and operate on various cryptographic values as scalars: several hash functions and encryption functions use 256-bit values as inputs or outputs, and it is frequently convenient to perform 256-bit integer-arithmetic or bitwise operations when working with those functions. + +Two additional integral-wrapper types -- `Duration` and `TimePoint` -- exist merely for the sake of avoiding errors and enabling meaningful display formatting when working with time values (e.g. to hint to a user interface to display a `TimePoint` as `2023-08-24T11:00:18+00:00` rather than `1692874818`). Internally both types are `u64`. + +Floating-point arithmetic is disabled in the WASM VM, and floating-point types are not used anywhere in the `SCVal` value repertoire or the host interface, out of concern for nondeterminism and survey feedback from potential users that they would not be used. + +Fixed-point arithmetic functions could potentially be provided in the host, but feedback during development indicated that most users would be doing fixed point calculations with the 128-bit type, which is expected to remain on the guest as a 128-bit guest arithmetic operation costs roughly the same amount of CPU work as a host call. Users are therefore encouraged to simply include their own fixed-point library code in contracts. Some support code for this may be added to the Soroban guest SDK. -The value tagging scheme is arranged into two levels -- an primary single-bit tag followed by a secondary 3-bit tag in one of the two primary cases -- in order to facilitate storing positive 64-bit integers in one of the primary cases, without overflowing to an object. We observe that the majority of 64-bit values in the current ledger are positive, representing (for example) asset amounts, time points and sequence numbers.
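As a rough illustration of the guest-side fixed-point code that the numeric-types rationale has in mind, the following sketch uses Rust's native `i128` with 18 fractional decimal digits. The `SCALE` constant and the `fixed_mul` / `fixed_div` helpers are hypothetical names chosen for this example, not part of any Soroban SDK, and results are truncated toward zero rather than rounded.

```rust
/// Fixed-point scale: 10^18, i.e. 18 decimal digits after the point.
/// (An assumption of this sketch; a contract may pick any scale it likes.)
const SCALE: i128 = 1_000_000_000_000_000_000;

/// Multiplies two fixed-point numbers, returning `None` on overflow.
/// The raw product carries the scale twice, so one factor is divided out.
fn fixed_mul(a: i128, b: i128) -> Option<i128> {
    a.checked_mul(b)?.checked_div(SCALE)
}

/// Divides two fixed-point numbers, returning `None` on overflow
/// or division by zero. The dividend is pre-scaled to keep precision.
fn fixed_div(a: i128, b: i128) -> Option<i128> {
    a.checked_mul(SCALE)?.checked_div(b)
}
```

For instance, `fixed_mul` applied to the representations of 1.5 and 2.0 yields the representation of 3.0. Note that the intermediate product can overflow well before the final result would, which is why the checked operations are used here.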
+#### Container types -Implementations of the map and vector object types are based on design techniques from the functional language community, specifically [Relaxed-Radix-Balanced vectors (RRBs)](https://dl.acm.org/doi/10.1145/2784731.2784739) and [Hash Array Mapped Tries (HAMTs)](https://en.wikipedia.org/wiki/Hash_array_mapped_trie). Both of these data types support efficient "modifying copies" that produce new data structures from updates applied to old ones, while sharing most of the memory and substructure of the old object with the new one. +Implementations of the map and vector object types are based on Rust's standard vector type, are always precisely sized to their data, and are immutable once constructed. The map type is a sorted vector of key-value pairs that is binary searched during map lookup, but otherwise lacks any advanced structure. + +Earlier versions of this CAP suggested the use of container objects with "shared substructure" such as HAMTs, functional red-black trees or RRBs. These were used early in Soroban's development, but it was observed that most host objects were small due to pressure from the persistent storage system and transaction system, and the overhead of objects with shared substructure exceeded the cost of a simpler approach of merely duplicating objects in full every time they are modified. As a result, the simpler approach was adopted. + +Containers **are** nonetheless converted from their XDR forms to internal forms. The host's internal form of an `SCVec` is a vector of `Val` host values, each only 64 bits, rather than a vector of arbitrarily large `SCVal`s. Similarly, the host's internal form of an `SCMap` is a map of pairs of `Val` host values. In both cases this helps minimize the size overhead of the (frequently duplicated) host containers, and simplifies accounting for operations on them, since all `Val`s within them are the same small size.
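The map design described above (a sorted vector of pairs, binary searched on lookup and copied in full on modification) can be sketched as follows. This is an illustration only: `SortedMap` is a hypothetical name, and `Val` is stood in for by a plain `u64` ordered numerically, whereas the real host orders keys by the total order on host values defined in this CAP.

```rust
/// Illustrative stand-in for a 64-bit host value (assumption: plain
/// integer ordering instead of the CAP's total order on `Val`s).
type Val = u64;

/// A map stored as a vector of (key, value) pairs kept sorted by key.
#[derive(Clone, Debug, PartialEq)]
struct SortedMap {
    entries: Vec<(Val, Val)>,
}

impl SortedMap {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Looks up a key by binary search over the sorted entries.
    fn get(&self, key: Val) -> Option<Val> {
        self.entries
            .binary_search_by(|(k, _)| k.cmp(&key))
            .ok()
            .map(|i| self.entries[i].1)
    }

    /// Returns a new map with the entry added or replaced.
    /// The original map is left untouched, mirroring immutable host objects.
    fn insert(&self, key: Val, val: Val) -> Self {
        let mut entries = self.entries.clone();
        match entries.binary_search_by(|(k, _)| k.cmp(&key)) {
            Ok(i) => entries[i].1 = val,
            Err(i) => entries.insert(i, (key, val)),
        }
        Self { entries }
    }
}
```

Each `insert` duplicates the whole vector, which is the "simpler approach" trade-off discussed above: cheap when objects are small, and it keeps older versions of the map valid after an update.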
+ +#### Buffer types + +Three types in the `SCVal` / `Val` repertoire are all variations on "a byte buffer": + + - `Bytes` which carries no implication about its content. This is the most general type. + - `String` which carries an implication that its content is text in some format (most likely UTF-8 unicode). No structure is _mandated_ for `String` but at a user-interface level it is often helpful to parse and display text differently from general byte sequences. + - `Symbol` is like `String` but imposes additional constraints: a maximum size of 32 characters, and a repertoire of characters drawn from the set `[a-zA-Z0-9_]`. The size limit is imposed to help support `Symbol`s in guest code without needing a heap allocator. The limited repertoire is chosen for several reasons: + - It is visually unambiguous in many typefaces, and so reduces the security risks from confusable Unicode codepoints or non-canonical code sequences, which can result in `String`s that "look the same" but contain different bytes. + - It has only 63 codes, which (combined with a code for null) is small enough to be packed into 6 bits, which in turn enables bit-packing small 9 character XDR `Symbol`s into the body of the `SymbolSmall` case of the host `Val` type, an important space optimization as `Symbol`s are relatively ubiquitous. + - It is a widely-used repertoire in surveys of the ecosystem and legacy systems: it covers most program identifiers, such as datatype and function names, as well as most asset identifier codes. ### Rationale for separate XDR and host forms It would be possible to store all data in memory in the host in its XDR format, but we choose instead to define a separate "host form" for both values and objects in this specification for the following reasons: - - In the host form, values are bit-packed in order to fit in exactly 64 bits.
This bit-packing is implemented in stellar-core but is somewhat delicate and would be undesirable to reimplement in every client SDK and data browser. In the XDR form, the various cases that make up the value union are represented in a standard XDR union, which is automatically supported by many languages' XDR bindings. + - In the host form, values are bit-packed in order to fit in exactly 64 bits. This bit-packing is implemented in Rust code in the Soroban host (and _partially_ available to Rust guest code) but many parts of it are host-specific, and quite delicate, and would in any case be undesirable to reimplement in every client SDK and data browser. In the XDR form, the various cases that make up the value union are represented in a standard XDR union, which is automatically supported by many languages' XDR bindings. - - In the host form, objects and values are separated for reasons explained above, and their separation is mediated through object _references_ and the _host environment_ that maps references to objects. In the XDR form, objects and values are _not_ separated, because they should not be: there is no implicit context in which to resolve references, and even if there were it would introduce a new category of potential reference-mismatch error in the serialized form to support it. Instead, in the XDR form values _directly contain_ objects. + - In the host form, objects and values are separated for reasons explained above, and their separation is mediated through object _handles_ and the _host environment_ that maps references to objects. In the XDR form, objects and values are _not_ separated, because they should not be: there is no implicit context in which to resolve handles, and even if there were it would introduce a new category of potential handle-mismatch error in the serialized form to support it. Instead, in the XDR form values _directly contain_ objects. 
- - In the host form, maps and vectors are implemented using memory-efficient substructure-sharing datatypes as described above. Additionally, maps support CPU-efficient hashed lookup by key. In the XDR form, maps are simple linear arrays of key-value pairs, and neither vectors nor maps support any sort of partial substructure-sharing updates. + - As mentioned above, containers in the host form are actually more efficient and simpler to work with having been converted from containers of XDR `SCVal`s to containers of host `Val`s. ### Rationale for immutable objects @@ -376,17 +490,28 @@ Costs: Benefits: - Reduced risk of error through mutating a shared object. - - Simple model of equality, for using structured values as map keys. + - Stable total order, for using structured values as map keys. - Simple model of security: no covert channels, only passed values. - Simple model for transactions: discard objects on rollback. -Since we expect smart contracts to run to completion very quickly, and then free all objects allocated, we do not consider the additional memory allocation cost a likely problem in practice. Furthermore as mentioned in the object-repertoire rationale above, we have been using shared-substructure types in our prototype, so most large-object updates should only consume minimal new memory. +Since we expect smart contracts to run to completion very quickly, and then free all objects allocated, we do not consider the additional memory allocation cost a likely problem in practice. Furthermore as mentioned in the object-repertoire rationale above, most objects are small. Therefore the only real risk we foresee is the increased risk of unintentionally referring to an old/stale object, and we believe this is outweighed by the reduced risk of unintentionally referring to a shared mutable object that it mutated through an alias. ## Protocol Upgrade Transition -This CAP does not introduce any protocol changes.
+ +The initial protocol upgrade to enable Soroban is outside the scope of this CAP, as it will simply enable Soroban transaction types where no previous Soroban transactions were allowed. + +Subsequent protocol upgrades must be carefully managed to ensure compatibility. Specifically the following mechanisms will assist in maintaining compatibility across upgrades: + + 1. Every contract must carry a custom WASM section called `contractenvmetav0`. This section must contain the serialized bytes of a sequence of the XDR type `SCEnvMetaEntry` which is a union switching on `SCEnvMetaEntryKind` that, initially, only contains a single possible case `SC_ENV_META_KIND_INTERFACE_VERSION`. This carries a `uint64` that defines an "interface version" of the contract, which encodes both a protocol version number (in the high 32 bits) and a prerelease number (in the low 32 bits). The prerelease number is only meaningful during Soroban's development and must be zero once Soroban is enabled. The SDK currently arranges to include this information automatically, based on the version of the Rust `soroban-env-common` crate it is compiled against. + 2. A contract's protocol number indicates the minimum required protocol for a contract to run, and is checked by the host when instantiating the contract: instantiating a contract with an unsupported protocol number results in an error before execution. + 3. Extensions to the host interface will always be accompanied by a protocol change. This allows contracts to be deployed before they are fully supported, and to activate only when the network votes to support new features. + 4. If the host needs to intentionally deprecate or change the behaviour of any host function or any other aspect of the host interface, it should also accompany this change with a protocol change. 
Since historical ledgers always specify the protocol number they were recorded under, marking different ledgers with different protocols is the intended (and only reliable) way to enable the host to switch between different forms of logic, replaying old ledgers on old backward-compatibility logic and new ledgers on new logic. + 5. To minimize the risk of _unintentional_ changes to the host's logic (and divergence among versions) entering the network due to, say, periodic software maintenance and dependency updates, the host is designed to support (and stellar-core is equipped to provide) _multiversioning_: to embed two full copies of the entire transitive tree of software dependencies of the host in process simultaneously, and to "switch over" between one version and another instantaneously, during a protocol upgrade. This allows delaying and then grouping together "all potentially risky" changes to dependencies until the next protocol-upgrade boundary, and then deploying them all simultaneously across the network. In other words, it is expected that the Soroban host will remain relatively static between protocol versions, only taking very minor updates that we have high certainty in the identical observable semantics of. + + The process of safely upgrading the network with Soroban enabled is described in more detail in [this document inside the stellar-core repository](https://github.com/stellar/stellar-core/blob/master/docs/versioning-soroban.md). ### Backwards Incompatibilities This CAP does not introduce any backward incompatibilities. @@ -424,8 +549,8 @@ TBD. See in-progress implementation. An implementation is provided in two parts: - 1. 
The [rs-stellar-contract-env repository](https://github.com/stellar/rs-stellar-contract-env) which contains three Rust crates defining: - - `stellar-contract-env-host`: a Rust implementation of the host environment - - `stellar-contract-env-guest`: a Rust interface for Rust guest code to interact with the host environment - - `stellar-contract-env-common`: a set of definitions common to both - 2. The [PR 3428](https://github.com/stellar/stellar-core/pull/3428) on the stellar-core repository, which provides the XDR definitions above and provides a connection between stellar-core and the `rs-stellar-contract-host` crate. + 1. The [rs-soroban-env repository](https://github.com/stellar/rs-soroban-env) which contains three Rust crates defining: + - `soroban-env-host`: a Rust implementation of the host environment + - `soroban-env-guest`: a Rust interface for Rust guest code to interact with the host environment + - `soroban-env-common`: a set of definitions common to both + 2. The [stellar-core repository](https://github.com/stellar/stellar-core/) which contains (by reference) the XDR definitions above and provides an embedding of the `soroban-env-host` crate inside `stellar-core`. diff --git a/core/cap-0046.md b/core/cap-0046.md index 20796c077..d72d6b4be 100644 --- a/core/cap-0046.md +++ b/core/cap-0046.md @@ -71,7 +71,7 @@ high-quality and low-effort developer experience for writing smart contracts. All specifications _besides_ the cumulative XDR diffs below are provided in the following sub-CAPs: - - [CAP-0046-01 (ex-0046) - Smart Contract Runtime Environment](./cap-0046-01.md) + - [CAP-0046-01 (ex-0046) - Soroban Runtime Environment](./cap-0046-01.md) covers the code and data _environment_ that smart contracts run inside, rather than their relationship to the rest of the network. This mostly relates to the new XDR files below, rather than the diffs. 
@@ -111,8 +111,9 @@ There are four entirely new XDR files: - [Stellar-contract.x](../contents/cap-0046/Stellar-contract.x) - [Stellar-contract-spec.x](../contents/cap-0046/Stellar-contract-spec.x) + - [Stellar-contract-meta.x](../contents/cap-0046/Stellar-contract-env-meta.x) - [Stellar-contract-env-meta.x](../contents/cap-0046/Stellar-contract-env-meta.x) - - [Stellar-contract-cost-type.x](../contents/cap-0046/Stellar-contract-cost-type.x) + - [Stellar-contract-config-setting.x](../contents/cap-0046/Stellar-contract-config-setting.x) As well as updates to several of the other XDR files, which are maintained and modified on an ongoing basis during the development of Soroban in a separate, @@ -129,37 +130,16 @@ That calculates the following difference between the `src/protocol-curr` and `src/protocol-next` directories: ```diff mddiffcheck.ignore=true -diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-contract.x src/protocol-next/xdr/Stellar-contract.x ---- src/protocol-curr/xdr/Stellar-contract.x -+++ src/protocol-next/xdr/Stellar-contract.x -@@ -250,14 +250,14 @@ - - enum SCContractCodeType - { -- SCCONTRACT_CODE_WASM = 0, -+ SCCONTRACT_CODE_WASM_REF = 0, - SCCONTRACT_CODE_TOKEN = 1 - }; - - union SCContractCode switch (SCContractCodeType type) - { --case SCCONTRACT_CODE_WASM: -- opaque wasm; -+case SCCONTRACT_CODE_WASM_REF: -+ Hash wasm_id; - case SCCONTRACT_CODE_TOKEN: - void; - }; diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-ledger-entries.x src/protocol-next/xdr/Stellar-ledger-entries.x ---- src/protocol-curr/xdr/Stellar-ledger-entries.x -+++ src/protocol-next/xdr/Stellar-ledger-entries.x -@@ -3,11 +3,11 @@ +--- src/protocol-curr/xdr/Stellar-ledger-entries.x 2023-06-27 12:08:33.568636804 -0700 ++++ src/protocol-next/xdr/Stellar-ledger-entries.x 2023-07-14 14:50:55.534242191 -0700 +@@ -3,17 +3,16 @@ // of this distribution or at http://www.apache.org/licenses/LICENSE-2.0 
%#include "xdr/Stellar-types.h" +%#include "xdr/Stellar-contract.h" -+%#include "xdr/Stellar-contract-cost-type.h" - ++%#include "xdr/Stellar-contract-config-setting.h" + namespace stellar { @@ -167,7 +147,13 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd typedef opaque Thresholds[4]; typedef string string32<32>; typedef string string64<64>; -@@ -98,7 +98,10 @@ + typedef int64 SequenceNumber; +-typedef uint64 TimePoint; +-typedef uint64 Duration; + typedef opaque DataValue<64>; + typedef Hash PoolID; // SHA256(LiquidityPoolParameters) + +@@ -98,7 +97,10 @@ OFFER = 2, DATA = 3, CLAIMABLE_BALANCE = 4, @@ -179,54 +165,68 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd }; struct Signer -@@ -491,6 +494,34 @@ +@@ -491,6 +493,60 @@ body; }; ++enum ContractEntryBodyType { ++ DATA_ENTRY = 0, ++ EXPIRATION_EXTENSION = 1 ++}; ++ ++const MASK_CONTRACT_DATA_FLAGS_V20 = 0x1; ++ ++enum ContractDataFlags { ++ // When set, the given entry does not recieve automatic expiration bumps ++ // on access. Note that entries can still be bumped manually via the footprint. 
++ NO_AUTOBUMP = 0x1 ++}; ++ ++enum ContractDataDurability { ++ TEMPORARY = 0, ++ PERSISTENT = 1 ++}; ++ +struct ContractDataEntry { -+ Hash contractID; ++ SCAddress contract; + SCVal key; -+ SCVal val; ++ ContractDataDurability durability; ++ ++ union switch (ContractEntryBodyType bodyType) ++ { ++ case DATA_ENTRY: ++ struct ++ { ++ uint32 flags; ++ SCVal val; ++ } data; ++ case EXPIRATION_EXTENSION: ++ void; ++ } body; ++ ++ uint32 expirationLedgerSeq; +}; + +struct ContractCodeEntry { ++ ExtensionPoint ext; ++ + Hash hash; -+ opaque code; -+ union switch (int v) ++ union switch (ContractEntryBodyType bodyType) + { -+ case 0: ++ case DATA_ENTRY: ++ opaque code<>; ++ case EXPIRATION_EXTENSION: + void; -+ } -+ ext; -+}; ++ } body; + -+enum ConfigSettingID -+{ -+ CONFIG_SETTING_CONTRACT_MAX_SIZE_BYTES = 0, -+ CONFIG_SETTING_CONTRACT_BUDGET_CPU_INSTRUCTIONS = 1, -+ CONFIG_SETTING_CONTRACT_COST_PARAMS_CPU_INSTRUCTIONS = 3, -+ CONFIG_SETTING_CONTRACT_BUDGET_MEMORY_BYTES = 2, -+ CONFIG_SETTING_CONTRACT_COST_PARAMS_MEMORY_BYTES = 4 ++ uint32 expirationLedgerSeq; +}; + -+union ConfigSettingEntry switch (ConfigSettingID configSettingID) -+{ -+case CONFIG_SETTING_CONTRACT_MAX_SIZE_BYTES: -+ uint32 contractMaxSizeBytes; -+case CONFIG_SETTING_CONTRACT_BUDGET_CPU_INSTRUCTIONS: -+ uint64 contractBudgetCpuInsns; -+case CONFIG_SETTING_CONTRACT_COST_PARAMS_CPU_INSTRUCTIONS: -+ ContractCostParams contractCostParamsCpuInsns; -+case CONFIG_SETTING_CONTRACT_BUDGET_MEMORY_BYTES: -+ uint64 contractBudgetMemBytes; -+case CONFIG_SETTING_CONTRACT_COST_PARAMS_MEMORY_BYTES: -+ ContractCostParams contractCostParamsMemBytes; -+}; + struct LedgerEntryExtensionV1 { SponsorshipDescriptor sponsoringID; -@@ -521,6 +552,12 @@ +@@ -521,6 +577,12 @@ ClaimableBalanceEntry claimableBalance; case LIQUIDITY_POOL: LiquidityPoolEntry liquidityPool; @@ -239,20 +239,23 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd } data; -@@ -575,6 +612,22 @@ +@@ -575,6 +637,25 @@ { 
PoolID liquidityPoolID; } liquidityPool; +case CONTRACT_DATA: + struct + { -+ Hash contractID; ++ SCAddress contract; + SCVal key; ++ ContractDataDurability durability; ++ ContractEntryBodyType bodyType; + } contractData; +case CONTRACT_CODE: + struct + { + Hash hash; ++ ContractEntryBodyType bodyType; + } contractCode; +case CONFIG_SETTING: + struct @@ -262,22 +265,19 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd }; // list of all envelope types used in the application -@@ -589,6 +642,11 @@ +@@ -589,6 +670,8 @@ ENVELOPE_TYPE_SCPVALUE = 4, ENVELOPE_TYPE_TX_FEE_BUMP = 5, ENVELOPE_TYPE_OP_ID = 6, - ENVELOPE_TYPE_POOL_REVOKE_OP_ID = 7 + ENVELOPE_TYPE_POOL_REVOKE_OP_ID = 7, -+ ENVELOPE_TYPE_CONTRACT_ID_FROM_ED25519 = 8, -+ ENVELOPE_TYPE_CONTRACT_ID_FROM_CONTRACT = 9, -+ ENVELOPE_TYPE_CONTRACT_ID_FROM_ASSET = 10, -+ ENVELOPE_TYPE_CONTRACT_ID_FROM_SOURCE_ACCOUNT = 11, -+ ENVELOPE_TYPE_CREATE_CONTRACT_ARGS = 12 ++ ENVELOPE_TYPE_CONTRACT_ID = 8, ++ ENVELOPE_TYPE_SOROBAN_AUTHORIZATION = 9 }; } diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-ledger.x src/protocol-next/xdr/Stellar-ledger.x ---- src/protocol-curr/xdr/Stellar-ledger.x -+++ src/protocol-next/xdr/Stellar-ledger.x +--- src/protocol-curr/xdr/Stellar-ledger.x 2023-06-27 12:08:33.572636794 -0700 ++++ src/protocol-next/xdr/Stellar-ledger.x 2023-08-03 13:12:14.983930940 -0700 @@ -47,13 +47,17 @@ ext; }; @@ -298,22 +298,33 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd }; struct LedgerHeaderExtensionV1 -@@ -122,7 +126,8 @@ +@@ -122,7 +126,14 @@ LEDGER_UPGRADE_BASE_FEE = 2, LEDGER_UPGRADE_MAX_TX_SET_SIZE = 3, LEDGER_UPGRADE_BASE_RESERVE = 4, - LEDGER_UPGRADE_FLAGS = 5 + LEDGER_UPGRADE_FLAGS = 5, -+ LEDGER_UPGRADE_CONFIG = 6 ++ LEDGER_UPGRADE_CONFIG = 6, ++ LEDGER_UPGRADE_MAX_SOROBAN_TX_SET_SIZE = 7 ++}; ++ ++struct ConfigUpgradeSetKey { ++ Hash contractID; ++ Hash contentHash; }; union LedgerUpgrade switch 
(LedgerUpgradeType type) -@@ -137,6 +142,12 @@ +@@ -137,6 +148,17 @@ uint32 newBaseReserve; // update baseReserve case LEDGER_UPGRADE_FLAGS: uint32 newFlags; // update flags +case LEDGER_UPGRADE_CONFIG: -+ Hash configUpgradeSetHash; ++ // Update arbitray `ConfigSetting` entries identified by the key. ++ ConfigUpgradeSetKey newConfig; ++case LEDGER_UPGRADE_MAX_SOROBAN_TX_SET_SIZE: ++ // Update ConfigSettingContractExecutionLanesV0.ledgerMaxTxCount without ++ // using `LEDGER_UPGRADE_CONFIG`. ++ uint32 newMaxSorobanTxSetSize; +}; + +struct ConfigUpgradeSet { @@ -321,47 +332,15 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd }; /* Entries used to define the bucket list */ -@@ -264,6 +275,32 @@ - ext; - }; - -+struct TransactionResultPairV2 -+{ -+ Hash transactionHash; -+ Hash hashOfMetaHashes; // hash of hashes in TransactionMetaV3 -+ // TransactionResult is in the meta -+}; -+ -+struct TransactionResultSetV2 -+{ -+ TransactionResultPairV2 results<>; -+}; -+ -+struct TransactionHistoryResultEntryV2 -+{ -+ uint32 ledgerSeq; -+ TransactionResultSetV2 txResultSet; -+ -+ // reserved for future use -+ union switch (int v) -+ { -+ case 0: -+ void; -+ } -+ ext; -+}; -+ - struct LedgerHeaderHistoryEntry - { - Hash hash; -@@ -348,6 +385,48 @@ +@@ -348,6 +370,74 @@ // applied if any }; +enum ContractEventType +{ + SYSTEM = 0, -+ CONTRACT = 1 ++ CONTRACT = 1, ++ DIAGNOSTIC = 2 +}; + +struct ContractEvent @@ -378,32 +357,57 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd + case 0: + struct + { -+ SCVec topics; ++ SCVal topics<>; + SCVal data; + } v0; + } + body; +}; + ++struct DiagnosticEvent ++{ ++ bool inSuccessfulContractCall; ++ ContractEvent event; ++}; ++ ++struct SorobanTransactionMeta ++{ ++ ExtensionPoint ext; ++ ++ ContractEvent events<>; // custom events populated by the ++ // contracts themselves. 
++ SCVal returnValue; // return value of the host fn invocation ++ ++ // Diagnostics events that are not hashed. ++ // This will contain all contract and diagnostic events. Even ones ++ // that were emitted in a failed contract call. ++ DiagnosticEvent diagnosticEvents<>; ++}; ++ +struct TransactionMetaV3 +{ -+ LedgerEntryChanges txChangesBefore; // tx level changes before operations -+ // are applied if any -+ OperationMeta operations<>; // meta for each operation -+ LedgerEntryChanges txChangesAfter; // tx level changes after operations are -+ // applied if any -+ ContractEvent events<>; // custom events populated by the -+ // contracts themselves -+ TransactionResult txResult; ++ ExtensionPoint ext; ++ ++ LedgerEntryChanges txChangesBefore; // tx level changes before operations ++ // are applied if any ++ OperationMeta operations<>; // meta for each operation ++ LedgerEntryChanges txChangesAfter; // tx level changes after operations are ++ // applied if any ++ SorobanTransactionMeta* sorobanMeta; // Soroban-specific meta (only for ++ // Soroban transactions). 
++}; + -+ Hash hashes[3]; // stores sha256(txChangesBefore, operations, txChangesAfter), -+ // sha256(events), and sha256(txResult) ++// This is in Stellar-ledger.x to due to a circular dependency ++struct InvokeHostFunctionSuccessPreImage ++{ ++ SCVal returnValue; ++ ContractEvent events<>; +}; + // this is the meta produced when applying transactions // it does not include pre-apply updates such as fees union TransactionMeta switch (int v) -@@ -358,6 +437,8 @@ +@@ -358,6 +448,8 @@ TransactionMetaV1 v1; case 2: TransactionMetaV2 v2; @@ -412,41 +416,41 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd }; // This struct groups together changes on a per transaction basis -@@ -370,6 +451,13 @@ - TransactionMeta txApplyProcessing; - }; - -+struct TransactionResultMetaV2 -+{ -+ TransactionResultPairV2 result; -+ LedgerEntryChanges feeProcessing; -+ TransactionMeta txApplyProcessing; -+}; -+ - // this represents a single upgrade that was performed as part of a ledger - // upgrade - struct UpgradeEntryMeta -@@ -414,11 +502,32 @@ +@@ -414,11 +506,46 @@ SCPHistoryEntry scpInfo<>; }; -+// only difference between V1 and V2 is this uses TransactionResultMetaV2 +struct LedgerCloseMetaV2 +{ ++ // We forgot to add an ExtensionPoint in v1 but at least ++ // we can add one now in v2. ++ ExtensionPoint ext; ++ + LedgerHeaderHistoryEntry ledgerHeader; -+ ++ + GeneralizedTransactionSet txSet; + + // NB: transactions are sorted in apply order here + // fees for all transactions are processed first + // followed by applying transactions -+ TransactionResultMetaV2 txProcessing<>; ++ TransactionResultMeta txProcessing<>; + + // upgrades are applied last + UpgradeEntryMeta upgradesProcessing<>; + + // other misc information attached to the ledger close + SCPHistoryEntry scpInfo<>; ++ ++ // Size in bytes of BucketList, to support downstream ++ // systems calculating storage fees correctly. 
++ uint64 totalByteSizeOfBucketList; ++ ++ // Expired temp keys that are being evicted at this ledger. ++ LedgerKey evictedTemporaryLedgerKeys<>; ++ ++ // Expired restorable ledger entries that are being ++ // evicted at this ledger. ++ LedgerEntry evictedPersistentLedgerEntries<>; +}; + union LedgerCloseMeta switch (int v) @@ -459,48 +463,10 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd + LedgerCloseMetaV2 v2; }; } -diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-overlay.x src/protocol-next/xdr/Stellar-overlay.x ---- src/protocol-curr/xdr/Stellar-overlay.x -+++ src/protocol-next/xdr/Stellar-overlay.x -@@ -83,7 +83,7 @@ - uint32 numFailures; - }; - --// Next ID: 18 -+// Next ID: 20 - enum MessageType - { - ERROR_MSG = 0, -@@ -113,7 +113,11 @@ - - SEND_MORE = 16, - FLOOD_ADVERT = 18, -- FLOOD_DEMAND = 19 -+ FLOOD_DEMAND = 19, -+ -+ // Configuration upgrades -+ GET_CONFIG_UPGRADE_SET = 20, -+ CONFIG_UPGRADE_SET = 21 - }; - - struct DontHave -@@ -243,6 +247,11 @@ - case SURVEY_RESPONSE: - SignedSurveyResponseMessage signedSurveyResponseMessage; - -+case GET_CONFIG_UPGRADE_SET: -+ uint256 configUgradeSetHash; -+case CONFIG_UPGRADE_SET: -+ ConfigUpgradeSet configUpgradeSet; -+ - // SCP - case GET_SCP_QUORUMSET: - uint256 qSetHash; -Only in src/protocol-curr/xdr: Stellar-overlay.x.bak diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-transaction.x src/protocol-next/xdr/Stellar-transaction.x ---- src/protocol-curr/xdr/Stellar-transaction.x -+++ src/protocol-next/xdr/Stellar-transaction.x -@@ -2,6 +2,7 @@ +--- src/protocol-curr/xdr/Stellar-transaction.x 2023-06-27 12:08:33.572636794 -0700 ++++ src/protocol-next/xdr/Stellar-transaction.x 2023-08-03 13:12:14.983930940 -0700 +@@ -2,11 +2,15 @@ // under the Apache License, Version 2.0. 
See the COPYING file at the root // of this distribution or at http://www.apache.org/licenses/LICENSE-2.0 @@ -508,31 +474,27 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd %#include "xdr/Stellar-ledger-entries.h" namespace stellar -@@ -32,6 +33,13 @@ - Signature signature; // actual signature - }; + { -+// Ledger key sets touched by a smart contract transaction. -+struct LedgerFootprint -+{ -+ LedgerKey readOnly<>; -+ LedgerKey readWrite<>; -+}; ++// maximum number of operations per transaction ++const MAX_OPS_PER_TX = 100; + - enum OperationType + union LiquidityPoolParameters switch (LiquidityPoolType type) { - CREATE_ACCOUNT = 0, -@@ -57,7 +65,8 @@ + case LIQUIDITY_POOL_CONSTANT_PRODUCT: +@@ -57,7 +61,10 @@ CLAWBACK_CLAIMABLE_BALANCE = 20, SET_TRUST_LINE_FLAGS = 21, LIQUIDITY_POOL_DEPOSIT = 22, - LIQUIDITY_POOL_WITHDRAW = 23 + LIQUIDITY_POOL_WITHDRAW = 23, -+ INVOKE_HOST_FUNCTION = 24 ++ INVOKE_HOST_FUNCTION = 24, ++ BUMP_FOOTPRINT_EXPIRATION = 25, ++ RESTORE_FOOTPRINT = 26 }; /* CreateAccount -@@ -465,6 +474,91 @@ +@@ -465,6 +472,141 @@ int64 minAmountB; // minimum amount of second asset to withdraw }; @@ -540,141 +502,233 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd +{ + HOST_FUNCTION_TYPE_INVOKE_CONTRACT = 0, + HOST_FUNCTION_TYPE_CREATE_CONTRACT = 1, -+ HOST_FUNCTION_TYPE_INSTALL_CONTRACT_CODE = 2 ++ HOST_FUNCTION_TYPE_UPLOAD_CONTRACT_WASM = 2 +}; + -+enum ContractIDType ++enum ContractIDPreimageType +{ -+ CONTRACT_ID_FROM_PUBLIC_KEY = 0, -+ CONTRACT_ID_FROM_ASSET = 1 ++ CONTRACT_ID_PREIMAGE_FROM_ADDRESS = 0, ++ CONTRACT_ID_PREIMAGE_FROM_ASSET = 1 +}; + -+enum ContractIDPublicKeyType ++union ContractIDPreimage switch (ContractIDPreimageType type) +{ -+ CONTRACT_ID_PUBLIC_KEY_SOURCE_ACCOUNT = 0, -+ CONTRACT_ID_PUBLIC_KEY_ED25519 = 1 ++case CONTRACT_ID_PREIMAGE_FROM_ADDRESS: ++ struct ++ { ++ SCAddress address; ++ uint256 salt; ++ } fromAddress; ++case CONTRACT_ID_PREIMAGE_FROM_ASSET: 
++ Asset fromAsset; +}; + -+struct InstallContractCodeArgs ++struct CreateContractArgs +{ -+ opaque code; ++ ContractIDPreimage contractIDPreimage; ++ ContractExecutable executable; +}; + -+enum CreateContractSourceType { -+ CONTRACT_SOURCE_REF = 0, -+ CONTRACT_SOURCE_INSTALLED = 1 ++struct InvokeContractArgs { ++ SCAddress contractAddress; ++ SCSymbol functionName; ++ SCVal args<>; +}; + -+union CreateContractSource switch (CreateContractSourceType type) ++union HostFunction switch (HostFunctionType type) +{ -+case CONTRACT_SOURCE_REF: -+ SCContractCode codeRef; -+case CONTRACT_SOURCE_INSTALLED: -+ InstallContractCodeArgs installContractCodeArgs; ++case HOST_FUNCTION_TYPE_INVOKE_CONTRACT: ++ InvokeContractArgs invokeContract; ++case HOST_FUNCTION_TYPE_CREATE_CONTRACT: ++ CreateContractArgs createContract; ++case HOST_FUNCTION_TYPE_UPLOAD_CONTRACT_WASM: ++ opaque wasm<>; +}; + -+union ContractIDPublicKey switch (ContractIDPublicKeyType type) ++enum SorobanAuthorizedFunctionType +{ -+case CONTRACT_ID_PUBLIC_KEY_SOURCE_ACCOUNT: -+ void; -+case CONTRACT_ID_PUBLIC_KEY_ED25519: -+ struct -+ { -+ uint256 key; -+ Signature signature; -+ } ed25519KeyWithSignature; ++ SOROBAN_AUTHORIZED_FUNCTION_TYPE_CONTRACT_FN = 0, ++ SOROBAN_AUTHORIZED_FUNCTION_TYPE_CREATE_CONTRACT_HOST_FN = 1 +}; + -+union ContractID switch (ContractIDType type) ++union SorobanAuthorizedFunction switch (SorobanAuthorizedFunctionType type) +{ -+case CONTRACT_ID_FROM_PUBLIC_KEY: -+ struct -+ { -+ ContractIDPublicKey keySource; -+ uint256 salt; -+ } publicKey; -+case CONTRACT_ID_FROM_ASSET: -+ Asset asset; ++case SOROBAN_AUTHORIZED_FUNCTION_TYPE_CONTRACT_FN: ++ InvokeContractArgs contractFn; ++case SOROBAN_AUTHORIZED_FUNCTION_TYPE_CREATE_CONTRACT_HOST_FN: ++ CreateContractArgs createContractHostFn; +}; + -+struct CreateContractArgs ++struct SorobanAuthorizedInvocation +{ -+ ContractID contractID; -+ CreateContractSource source; ++ SorobanAuthorizedFunction function; ++ SorobanAuthorizedInvocation 
subInvocations<>; +}; + -+union HostFunction switch (HostFunctionType type) ++struct SorobanAddressCredentials +{ -+case HOST_FUNCTION_TYPE_INVOKE_CONTRACT: -+ SCVec invokeArgs; -+case HOST_FUNCTION_TYPE_CREATE_CONTRACT: -+ CreateContractArgs createContractArgs; -+case HOST_FUNCTION_TYPE_INSTALL_CONTRACT_CODE: -+ InstallContractCodeArgs installContractCodeArgs; ++ SCAddress address; ++ int64 nonce; ++ uint32 signatureExpirationLedger; ++ SCVal signature; +}; + ++enum SorobanCredentialsType ++{ ++ SOROBAN_CREDENTIALS_SOURCE_ACCOUNT = 0, ++ SOROBAN_CREDENTIALS_ADDRESS = 1 ++}; ++ ++union SorobanCredentials switch (SorobanCredentialsType type) ++{ ++case SOROBAN_CREDENTIALS_SOURCE_ACCOUNT: ++ void; ++case SOROBAN_CREDENTIALS_ADDRESS: ++ SorobanAddressCredentials address; ++}; ++ ++/* Unit of authorization data for Soroban. ++ ++ Represents an authorization for executing the tree of authorized contract ++ and/or host function calls by the user defined by `credentials`. ++*/ ++struct SorobanAuthorizationEntry ++{ ++ SorobanCredentials credentials; ++ SorobanAuthorizedInvocation rootInvocation; ++}; ++ ++/* Upload WASM, create, and invoke contracts in Soroban. ++ ++ Threshold: med ++ Result: InvokeHostFunctionResult ++*/ +struct InvokeHostFunctionOp +{ -+ // The host function to invoke -+ HostFunction function; -+ // The footprint for this invocation -+ LedgerFootprint footprint; ++ // Host function to invoke. ++ HostFunction hostFunction; ++ // Per-address authorizations for this host function. ++ SorobanAuthorizationEntry auth<>; ++}; ++ ++/* Bump the expiration ledger of the entries specified in the readOnly footprint ++ so they'll expire at least ledgersToExpire ledgers from lcl. ++ ++ Threshold: med ++ Result: BumpFootprintExpirationResult ++*/ ++struct BumpFootprintExpirationOp ++{ ++ ExtensionPoint ext; ++ uint32 ledgersToExpire; ++}; ++ ++/* Restore the expired or evicted entries specified in the readWrite footprint. 
++ ++ Threshold: med ++ Result: RestoreFootprintOp ++*/ ++struct RestoreFootprintOp ++{ ++ ExtensionPoint ext; +}; + /* An operation is the lowest unit of work that a transaction does */ struct Operation { -@@ -523,6 +617,8 @@ +@@ -523,6 +665,12 @@ LiquidityPoolDepositOp liquidityPoolDepositOp; case LIQUIDITY_POOL_WITHDRAW: LiquidityPoolWithdrawOp liquidityPoolWithdrawOp; + case INVOKE_HOST_FUNCTION: + InvokeHostFunctionOp invokeHostFunctionOp; ++ case BUMP_FOOTPRINT_EXPIRATION: ++ BumpFootprintExpirationOp bumpFootprintExpirationOp; ++ case RESTORE_FOOTPRINT: ++ RestoreFootprintOp restoreFootprintOp; } body; }; -@@ -545,6 +641,40 @@ +@@ -540,11 +688,25 @@ + struct + { + AccountID sourceAccount; +- SequenceNumber seqNum; ++ SequenceNumber seqNum; + uint32 opNum; PoolID liquidityPoolID; Asset asset; } revokeID; -+case ENVELOPE_TYPE_CONTRACT_ID_FROM_ED25519: -+ struct -+ { -+ Hash networkID; -+ uint256 ed25519; -+ uint256 salt; -+ } ed25519ContractID; -+case ENVELOPE_TYPE_CONTRACT_ID_FROM_CONTRACT: ++case ENVELOPE_TYPE_CONTRACT_ID: + struct + { + Hash networkID; -+ Hash contractID; -+ uint256 salt; ++ ContractIDPreimage contractIDPreimage; + } contractID; -+case ENVELOPE_TYPE_CONTRACT_ID_FROM_ASSET: -+ struct -+ { -+ Hash networkID; -+ Asset asset; -+ } fromAsset; -+case ENVELOPE_TYPE_CONTRACT_ID_FROM_SOURCE_ACCOUNT: ++case ENVELOPE_TYPE_SOROBAN_AUTHORIZATION: + struct + { + Hash networkID; -+ AccountID sourceAccount; -+ uint256 salt; -+ } sourceAccountContractID; -+case ENVELOPE_TYPE_CREATE_CONTRACT_ARGS: -+ struct -+ { -+ Hash networkID; -+ CreateContractSource source; -+ uint256 salt; -+ } createContractArgs; ++ int64 nonce; ++ uint32 signatureExpirationLedger; ++ SorobanAuthorizedInvocation invocation; ++ } sorobanAuthorization; }; enum MemoType -@@ -1588,6 +1718,25 @@ +@@ -632,8 +794,40 @@ + PreconditionsV2 v2; + }; + +-// maximum number of operations per transaction +-const MAX_OPS_PER_TX = 100; ++// Ledger key sets touched by a smart contract transaction. 
++struct LedgerFootprint ++{ ++ LedgerKey readOnly<>; ++ LedgerKey readWrite<>; ++}; ++ ++// Resource limits for a Soroban transaction. ++// The transaction will fail if it exceeds any of these limits. ++struct SorobanResources ++{ ++ // The ledger footprint of the transaction. ++ LedgerFootprint footprint; ++ // The maximum number of instructions this transaction can use ++ uint32 instructions; ++ ++ // The maximum number of bytes this transaction can read from ledger ++ uint32 readBytes; ++ // The maximum number of bytes this transaction can write to ledger ++ uint32 writeBytes; ++ ++ // Maximum size of the contract events (serialized to XDR) this transaction ++ // can emit. ++ uint32 contractEventsSizeBytes; ++}; ++ ++// The transaction extension for Soroban. ++struct SorobanTransactionData ++{ ++ ExtensionPoint ext; ++ SorobanResources resources; ++ // Portion of transaction `fee` allocated to refundable fees. ++ int64 refundableFee; ++}; + + // TransactionV0 is a transaction with the AccountID discriminant stripped off, + // leaving a raw ed25519 public key to identify the source account. 
This is used +@@ -695,6 +889,8 @@ + { + case 0: + void; ++ case 1: ++ SorobanTransactionData sorobanData; + } + ext; + }; +@@ -1588,6 +1784,67 @@ void; }; @@ -685,34 +739,117 @@ diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xd + + // codes considered as "failure" for the operation + INVOKE_HOST_FUNCTION_MALFORMED = -1, -+ INVOKE_HOST_FUNCTION_TRAPPED = -2 ++ INVOKE_HOST_FUNCTION_TRAPPED = -2, ++ INVOKE_HOST_FUNCTION_RESOURCE_LIMIT_EXCEEDED = -3, ++ INVOKE_HOST_FUNCTION_ENTRY_EXPIRED = -4 +}; + +union InvokeHostFunctionResult switch (InvokeHostFunctionResultCode code) +{ +case INVOKE_HOST_FUNCTION_SUCCESS: -+ SCVal success; ++ Hash success; // sha256(InvokeHostFunctionSuccessPreImage) +case INVOKE_HOST_FUNCTION_MALFORMED: +case INVOKE_HOST_FUNCTION_TRAPPED: ++case INVOKE_HOST_FUNCTION_RESOURCE_LIMIT_EXCEEDED: ++case INVOKE_HOST_FUNCTION_ENTRY_EXPIRED: ++ void; ++}; ++ ++enum BumpFootprintExpirationResultCode ++{ ++ // codes considered as "success" for the operation ++ BUMP_FOOTPRINT_EXPIRATION_SUCCESS = 0, ++ ++ // codes considered as "failure" for the operation ++ BUMP_FOOTPRINT_EXPIRATION_MALFORMED = -1, ++ BUMP_FOOTPRINT_EXPIRATION_RESOURCE_LIMIT_EXCEEDED = -2 ++}; ++ ++union BumpFootprintExpirationResult switch (BumpFootprintExpirationResultCode code) ++{ ++case BUMP_FOOTPRINT_EXPIRATION_SUCCESS: ++ void; ++case BUMP_FOOTPRINT_EXPIRATION_MALFORMED: ++case BUMP_FOOTPRINT_EXPIRATION_RESOURCE_LIMIT_EXCEEDED: ++ void; ++}; ++ ++enum RestoreFootprintResultCode ++{ ++ // codes considered as "success" for the operation ++ RESTORE_FOOTPRINT_SUCCESS = 0, ++ ++ // codes considered as "failure" for the operation ++ RESTORE_FOOTPRINT_MALFORMED = -1, ++ RESTORE_FOOTPRINT_RESOURCE_LIMIT_EXCEEDED = -2 ++}; ++ ++union RestoreFootprintResult switch (RestoreFootprintResultCode code) ++{ ++case RESTORE_FOOTPRINT_SUCCESS: ++ void; ++case RESTORE_FOOTPRINT_MALFORMED: ++case RESTORE_FOOTPRINT_RESOURCE_LIMIT_EXCEEDED: + void; +}; + /* High level 
Operation Result */ enum OperationResultCode { -@@ -1654,6 +1803,8 @@ +@@ -1654,6 +1911,12 @@ LiquidityPoolDepositResult liquidityPoolDepositResult; case LIQUIDITY_POOL_WITHDRAW: LiquidityPoolWithdrawResult liquidityPoolWithdrawResult; + case INVOKE_HOST_FUNCTION: + InvokeHostFunctionResult invokeHostFunctionResult; ++ case BUMP_FOOTPRINT_EXPIRATION: ++ BumpFootprintExpirationResult bumpFootprintExpirationResult; ++ case RESTORE_FOOTPRINT: ++ RestoreFootprintResult restoreFootprintResult; } tr; case opBAD_AUTH: +@@ -1689,7 +1952,9 @@ + txBAD_SPONSORSHIP = -14, // sponsorship not confirmed + txBAD_MIN_SEQ_AGE_OR_GAP = + -15, // minSeqAge or minSeqLedgerGap conditions not met +- txMALFORMED = -16 // precondition is invalid ++ txMALFORMED = -16, // precondition is invalid ++ // declared Soroban resource usage exceeds the network limit ++ txSOROBAN_RESOURCE_LIMIT_EXCEEDED = -17 + }; + + // InnerTransactionResult must be binary compatible with TransactionResult +@@ -1720,6 +1985,7 @@ + case txBAD_SPONSORSHIP: + case txBAD_MIN_SEQ_AGE_OR_GAP: + case txMALFORMED: ++ case txSOROBAN_RESOURCE_LIMIT_EXCEEDED: + void; + } + result; +@@ -1766,6 +2032,7 @@ + case txBAD_SPONSORSHIP: + case txBAD_MIN_SEQ_AGE_OR_GAP: + case txMALFORMED: ++ case txSOROBAN_RESOURCE_LIMIT_EXCEEDED: + void; + } + result; diff -ru '--exclude=*.h' '--exclude=.git*' '--exclude=*.md' src/protocol-curr/xdr/Stellar-types.x src/protocol-next/xdr/Stellar-types.x ---- src/protocol-curr/xdr/Stellar-types.x -+++ src/protocol-next/xdr/Stellar-types.x -@@ -79,6 +79,7 @@ +--- src/protocol-curr/xdr/Stellar-types.x 2023-06-27 12:08:33.572636794 -0700 ++++ src/protocol-next/xdr/Stellar-types.x 2023-07-14 14:50:55.538242159 -0700 +@@ -14,6 +14,9 @@ + typedef unsigned hyper uint64; + typedef hyper int64; + ++typedef uint64 TimePoint; ++typedef uint64 Duration; ++ + // An ExtensionPoint is always marshaled as a 32-bit 0 value. At a + // later point, it can be replaced by a different union so as to + // extend a structure. 
+@@ -79,6 +82,7 @@ typedef opaque SignatureHint[4]; typedef PublicKey NodeID; From 08b8849c23cb6c98e42227a788fbe95eb59b6067 Mon Sep 17 00:00:00 2001 From: Yuri Escalianti Date: Thu, 24 Aug 2023 19:34:09 -0300 Subject: [PATCH 3/7] SEP-40: Asset enum instead of Address (#1357) * SEP-40 u128 and Bytes proposal * Language improvements * Markdown fix * reverted u128 to i128, changed Bytes to Asset * Added Asset::ISO4217 * updated docs to match new enum * simplified Asset struct --- ecosystem/sep-0040.md | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/ecosystem/sep-0040.md b/ecosystem/sep-0040.md index caa159d6c..7ec795b40 100644 --- a/ecosystem/sep-0040.md +++ b/ecosystem/sep-0040.md @@ -51,22 +51,29 @@ The price feed contract should follow the `PriceFeedTrait` interface defined her ```rust /// Price data for an asset at a specific timestamp +#[contracttype] pub struct PriceData { price: i128, timestamp: u64 } +#[contracttype] +enum Asset { + Stellar(Address), + Other(Symbol), +} + /// Oracle feed interface description pub trait PriceFeedTrait { /// Return the base asset the price is reported in fn base( env: soroban_sdk::Env, - ) -> Address; + ) -> Asset; /// Return all assets quoted by the price feed fn assets( env: soroban_sdk::Env, - ) -> Vec
<Address>; + ) -> Vec<Asset>; /// Return the number of decimals for all assets quoted by the oracle fn decimals( @@ -81,25 +88,30 @@ pub trait PriceFeedTrait { /// Get price in base asset at specific timestamp fn price( env: soroban_sdk::Env, - asset: Address, + asset: Asset, timestamp: u64 ) -> Option<PriceData>; /// Get last N price records fn prices( env: soroban_sdk::Env, - asset: Address, + asset: Asset, records: u32 ) -> Option<Vec<PriceData>>; /// Get the most recent price for an asset fn lastprice( env: soroban_sdk::Env, - asset: Address, + asset: Asset, ) -> Option<PriceData>; } ``` +Assets are represented as the enum `Asset`, in the following formats: + +- Stellar assets: `Asset::Stellar` containing the asset's unique address on the Soroban network, obtained when an asset from Stellar Classic is deployed to Soroban via the [Stellar Asset Contract](https://soroban.stellar.org/docs/how-to-guides/stellar-asset-contract). E.g. `d93f5c7bb0ebc4a9c8f727c5cebc4e41194d38257e1d0d910356b43bfc528813`. +- Other assets (e.g. fiat off-chain currencies, mutual fund tokens, etc.): `Asset::Other` containing the currency or token name. Since `Asset::Other` can be any `Symbol`, each Oracle can define its own format for representing off-chain assets. + ### Contract Address The price feed aggregator contract should have a stable contract address. That is, it must be upgradeable without changing the contract From d2976c6611458843c76f131bba697aea9a3a4c64 Mon Sep 17 00:00:00 2001 From: Dmytro Kozhevin Date: Thu, 24 Aug 2023 19:24:59 -0400 Subject: [PATCH 4/7] Update CAP-0046-07 (Soroban fees) (#1384) * Update CAP-0046-07 (Soroban fees) This mostly brings the specifications up to date with the current state and also adds a bit of details on runtime enforcement. * Add expiration write fee description.
--------- Co-authored-by: Siddharth Suresh --- core/cap-0046-07.md | 504 +++++++++++++++++++++++++++----------------- core/cap-0046.md | 2 +- 2 files changed, 314 insertions(+), 192 deletions(-) diff --git a/core/cap-0046-07.md b/core/cap-0046-07.md index e6e9509d0..28798f52e 100644 --- a/core/cap-0046-07.md +++ b/core/cap-0046-07.md @@ -1,6 +1,6 @@ ``` CAP: 0046-07 (formerly 0055) -Title: Fee model in smart contracts +Title: Fee and resource model in smart contracts Working Group: Owner: MonsieurNicolas Authors: dmkozh @@ -15,10 +15,6 @@ Protocol version: TBD This CAP defines the mechanism used to determine fees when using smart contracts on the Stellar network. -## Working Group - -TBD - ## Motivation With the introduction of smart contracts on the network, the existing fee model of the "classic" transaction system is too simplistic: it requires careful design of the code that runs "on chain" as to ensure that all operations have a similar cost and performance profile, which is not possible with arbitrary code running in contracts. @@ -40,272 +36,388 @@ The fee structure is designed to discourage "spam" traffic and overall waste of ### XDR changes -The following network parameters are introduced (in some cases increments are used to mitigate for rounding errors): +See the full XDR diffs in the Soroban overview CAP. 
+ +Fee and resource limit configuration is specified via the following network parameters (in some cases increments are used to mitigate for rounding errors): ``` -// general “smart contract execution lane” settings -struct ContractExecutionLanesSettingsV0 +// General “Soroban execution lane” settings +struct ConfigSettingContractExecutionLanesV0 { - uint32 ledgerMaxTxCount; // maximum number of “smart” transactions per ledger + // maximum number of Soroban transactions per ledger + uint32 ledgerMaxTxCount; }; -// instruction count aka "compute" settings -struct ContractInstructionsNetworkSettingsV0 +// "Compute" settings for contracts (instructions and memory). +struct ConfigSettingContractComputeV0 { - int64 ledgerMaxInstructions; // maximum instructions per ledger - int64 txMaxInstructions; // maximum instructions per transaction - int64 feeRatePerInstructionsIncrement; // cost of INSTRUCTIONS_INCREMENT=10k instructions + // Maximum instructions per ledger + int64 ledgerMaxInstructions; + // Maximum instructions per transaction + int64 txMaxInstructions; + // Cost of 10000 instructions + int64 feeRatePerInstructionsIncrement; + + // Memory limit per transaction. Unlike instructions, there is no fee + // for memory, just the limit. + uint32 txMemoryLimit; }; -// Ledger access settings -struct ContractLedgerCostNetworkSettingsV0 +// Ledger access settings for contracts. 
+struct ConfigSettingContractLedgerCostV0 { - uint32 ledgerMaxReadLedgerEntries;// maximum number of ledger entry read operations per ledger - uint32 ledgerMaxReadBytes; // maximum number of bytes that can be read per ledger - uint32 ledgerMaxWriteLedgerEntries;// maximum number of ledger entry write operations per ledger - uint32 ledgerMaxWriteBytes; // maximum number of bytes that can be written per ledger - - uint32 txMaxReadLedgerEntries;// maximum number of ledger entry read operations per transaction - uint32 txMaxReadBytes; // maximum number of bytes that can be read per transaction - uint32 txMaxWriteLedgerEntries;// maximum number of ledger entry write operations per transaction - uint32 txMaxWriteBytes; // maximum number of bytes that can be written per transaction - - - int64 feeReadLedgerEntry; // fee per ledger entry read - int64 feeWriteLedgerEntry; // fee per ledger entry write - - int64 feeRead1KB; // fee for reading 1KB - int64 feeWrite1KB; // fee for writing 1KB - - int64 bucketListSizeBytes; // bucket list fees grow slowly up to that size - int64 bucketListFeeRateLow; // fee rate in stroops when the bucket list is empty - int64 bucketListFeeRateHigh; // fee rate in stroops when the bucket list reached bucketListSizeBytes - uint32 bucketListGrowthFactor; // rate multiplier for any additional data passed the first bucketListSizeBytes + // Maximum number of ledger entry read operations per ledger + uint32 ledgerMaxReadLedgerEntries; + // Maximum number of bytes that can be read per ledger + uint32 ledgerMaxReadBytes; + // Maximum number of ledger entry write operations per ledger + uint32 ledgerMaxWriteLedgerEntries; + // Maximum number of bytes that can be written per ledger + uint32 ledgerMaxWriteBytes; + + // Maximum number of ledger entry read operations per transaction + uint32 txMaxReadLedgerEntries; + // Maximum number of bytes that can be read per transaction + uint32 txMaxReadBytes; + // Maximum number of ledger entry write operations per 
transaction + uint32 txMaxWriteLedgerEntries; + // Maximum number of bytes that can be written per transaction + uint32 txMaxWriteBytes; + + int64 feeReadLedgerEntry; // Fee per ledger entry read + int64 feeWriteLedgerEntry; // Fee per ledger entry write + + int64 feeRead1KB; // Fee for reading 1KB + + // The following parameters determine the write fee per 1KB. + // Write fee grows linearly until bucket list reaches this size + int64 bucketListTargetSizeBytes; + // Fee per 1KB write when the bucket list is empty + int64 writeFee1KBBucketListLow; + // Fee per 1KB write when the bucket list has reached `bucketListTargetSizeBytes` + int64 writeFee1KBBucketListHigh; + // Write fee multiplier for any additional data past the first `bucketListTargetSizeBytes` + uint32 bucketListWriteFeeGrowthFactor; }; -// historical data (pushed to core archives) settings -struct ContractHistoricalNetworkSettingsV0 +// Historical data (pushed to core archives) settings for contracts. +struct ConfigSettingContractHistoricalDataV0 { - int64 feeHistorical1KB; // fee for storing 1KB in archives + int64 feeHistorical1KB; // Fee for storing 1KB in archives }; -// Meta data (pushed to downstream systems) settings -struct ContractMetaDataNetworkSettingsV0 +// Contract event-related settings. +struct ConfigSettingContractEventsV0 { - uint32 txMaxExtendedMetaDataSizeBytes; // maximum size of extended meta data produced by a transaction - int64 feeExtendedMetaData1KB; // fee for generating 1KB of extended meta data + // Maximum size of events that a contract call can emit. + uint32 txMaxContractEventsSizeBytes; + // Fee for generating 1KB of contract events. + int64 feeContractEvents1KB; }; -// Bandwidth related data settings -struct ContractBandwidthDataNetworkSettingsV0 +// Bandwidth related data settings for contracts. +// We consider bandwidth to only be consumed by the transaction envelopes, hence +// this concerns only transaction sizes. 
+struct ConfigSettingContractBandwidthV0 { - uint32 ledgerMaxPropagateSizeBytes; // maximum size in bytes to propagate per ledger - uint32 txMaxSizeBytes; // maximum size in bytes for a transaction + // Maximum sum of all transaction sizes in the ledger in bytes + uint32 ledgerMaxTxsSizeBytes; + // Maximum size in bytes for a transaction + uint32 txMaxSizeBytes; - int64 feePropagateData1KB; // fee for propagating 1KB of data + // Fee for 1 KB of transaction size + int64 feeTxSize1KB; }; - ``` -Additional changes at the Transaction/TransactionSet level: +Soroban resources are provided in a `SorobanTransactionData` extension of +transaction: + +``` +// Resource limits for a Soroban transaction. +// The transaction will fail if it exceeds any of these limits. +struct SorobanResources +{ + // The ledger footprint of the transaction. + LedgerFootprint footprint; + // The maximum number of instructions this transaction can use + uint32 instructions; + + // The maximum number of bytes this transaction can read from ledger + uint32 readBytes; + // The maximum number of bytes this transaction can write to ledger + uint32 writeBytes; +}; +// The transaction extension for Soroban. +struct SorobanTransactionData +{ + ExtensionPoint ext; + SorobanResources resources; + // Portion of transaction `fee` allocated to refundable fees. 
+ int64 refundableFee; +}; ``` -// Transaction changes (actual diff TBD) - int64 fee; // total fee for this transaction - int64 refundableFee; // portion of `fee` allocated to other refundable fees +### Semantics -// Additional properties: - uint32 instructions; // how many instructions are needed for this tx +#### Fee model overview - uint32 readBytes; // how many bytes will be read by this tx - uint32 writeBytes; // how many bytes will be written by this tx +The approach taken in this proposal is to decompose the total transaction fee into the following additive components: +* `resourcesFee` - the fee for 'competitive' network resources (defined below) and non-refundable resources, based on the values *declared* in transaction and network-defined fee rates. +* `refundableResourcesFee` - the maximum fee for resources that don't need to be strictly restricted per ledger and thus are charged based on the actual usage. +* `inclusionFeeBid` - this is the "social value" part of the fee, it represents the intrinsic value that the submitter puts on that transaction. - uint32 extendedMetaDataSizeBytes; // how many bytes can be added to the meta by this tx - -// TransactionEnvelope changes +The 'competitive' resources are resources that have to be limited per ledger in order to ensure reasonable close time and prevent network from overloading. These resources are bounded on different dimensions, i.e. there is no single 'proxy' resource that could be used to restrict them. On a high level, these resources are: +* instructions (virtual CPU instructions to execute) +* ledger data access (ledger IO metrics) +* network propagation (bandwidth usage) -// SCP value: hash of GeneralizedTransactionSet before and after removing payload +Soroban transaction fee has to cover all three components, but only `inclusionFeeBid` is used for transaction prioritization. 
+#### TransactionSet semantics -// GeneralizedTransactionSet changes TBD, need a way to express different fee markets +All Soroban transactions must be present in phase `1` of `GeneralizedTransactionSet` (all the remaining 'classic' transactions must be in phase `0`). The Soroban phase must contain only a single `TXSET_COMP_TXS_MAYBE_DISCOUNTED_FEE` component. Refer to [`CAP-0042`](./cap-0042.md) for details on `GeneralizedTransactionSet` and phases. -case TXSET_COMP_SMART_TXS_MAYBE_DISCOUNTED_FEE: - struct - { - int64* inclusionFee; - TransactionEnvelope txs<>; - } smartTxsMaybeDiscountedFee; +While transactions bid a specific `inclusionFeeBid`, the effective bid may be lowered within a transaction set component by setting `baseFee` in the `txsMaybeDiscountedFee` component. -// Other constants +When set: +* all transactions within the component must bid at least `baseFee`, i.e. for each transaction `inclusionFeeBid >= baseFee` +* the effective inclusion bid for transactions in that group is `baseFee` - const TX_BASE_RESULT_SIZE = 300; // the approximation to use for `TransactionResult` when pushing to archive -``` +The total resource consumption for every one of the 'competitive' resources must not exceed the ledger-wide limits. The specific limits are specified in the sections below on a per-resource basis. -Changes to ledger data: -``` +The usual `GeneralizedTransactionSet` validity and comparison rules also apply to Soroban, following the semantics described in [CAP-0042](./cap-0042.md). -TBD: `LedgerEntry` with expiration date +#### Transaction validation -``` +All Soroban transactions must have the `ext.sorobanData()` extension present and populated. -### Semantics +`resources` contains the declared values of resources that the transaction is paying the fee for. These values must not exceed the limits specified by the network settings. -Validity constraints: -* source account must be able to pay for the total fee bid `tx.fee` for that transaction.
-* the number of smart contract transactions cannot exceed `ledgerMaxTxCount` per ledger +`resourceFee` is computed based on the `resources` declared in `tx` and transaction envelope size: -#### Resources with contention and inclusion fees +`resourceFee(tx) = Instructions_fee(resources.instructions) + LedgerDataAccess_fee(resources) + NetworkData_fee(size(txEnvelope)) + Historical_flat_fee(size(txEnvelope))` -The approach taken in this proposal is to decompose fees into: -* `resourcesFee` - derived from how much resources a transaction is using and network parameters that evolve with time as to price those resources appropriately -* `inclusionFeeBid` - this is the "social value" part of the fee, it represents the intrinsic value that the submitter puts on that transaction. +Note, that `Historical_flat_fee` is a 'competitive' resource, but it's constant for any transaction execution result and thus is a part of non-refundable fee (as its refund is always 0). -The resources where we allow competition are: -* instructions (virtual instructions to execute) -* ledger data access (bytes transferred) -* network propagation (bandwidth) +`refundableFee` corresponds to the `refundableResourcesFee` component. -Transactions will have to pay both for `resourcesFee` and `inclusionFeeBid`, with `inclusionFeeBid` used to prioritize transactions relative to each other (both when flooding and for inclusion in transaction sets). +The rules for limits and fee computation per-resource are specified in dedicated sections below. -#### TransactionSet semantics +At validation time total transaction fee (`tx.fee`) has to cover the fee components based only on the values declared in transaction: -While transactions bid specific `inclusionFeeBid`, the effective bid may be lowered within a transaction set component by setting `inclusionFee`. 
+`tx.fee = resourceFee(tx) + sorobanData.refundableFee + inclusionFeeBid` -When set: -* all transactions within the component must bid more than `inclusionFee`, ie for each transaction `inclusionFeeBid >= inclusionFee` -* the effective inclusion bid for transactions in that group is `inclusionFee` +The minimum valid `inclusionFeeBid` value is 100 stroops, so the following condition must hold: + +`tx.fee >= resourceFee(tx) + sorobanData.refundableFee + 100` + +Similarly to 'classic' transactions, the source account must be able to pay the total fee (`tx.fee`) for the transaction. #### Fee computation while applying transactions As in classic, total fees are taken from the source account balance before applying transactions. -Note that the total fee charged is equal to -`tx.fee + inclusionFee - inclusionFeeBid` as to accomodate for potential inclusion fee discounts. +The total fee charged is equal to `tx.fee` if `baseFee` is not set in the transaction set component, and `tx.fee - inclusionFeeBid + baseFee` if `baseFee` is set in the transaction set component. + +During transaction execution the resource limits declared by the transaction are enforced, and exceeding any one of the limits leads to transaction failure with a `_RESOURCE_LIMIT_EXCEEDED` operation error code (every Soroban operation defines a separate error for this, such as `INVOKE_HOST_FUNCTION_RESOURCE_LIMIT_EXCEEDED`). + +The per-resource failure conditions are specified in the sections below. + +At the end of the transaction execution, compute the final refundable fee for a successful transaction as follows: + +`effectiveRefundableFee = Events_fee(emittedContractEventsSizeBytes) + Rent_fee` + +where `emittedContractEventsSizeBytes` is the size of the emitted contract events and invocation return value, and `Rent_fee` is the fee for the rent bumps performed by the transaction (if any). If `effectiveRefundableFee > sorobanData.refundableFee`, the transaction fails.
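The validation and charging rules above can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical; only the 100-stroop minimum bid and the two formulas come from the text, and `resource_fee` is assumed to have been computed already from the declared resources.

```python
from typing import Optional

# Minimum valid inclusionFeeBid in stroops, as stated in the text above.
MIN_INCLUSION_FEE_BID = 100


def inclusion_fee_bid(tx_fee: int, resource_fee: int, refundable_fee: int) -> int:
    """Part of tx.fee left over after the declared resource fees."""
    return tx_fee - resource_fee - refundable_fee


def is_fee_valid(tx_fee: int, resource_fee: int, refundable_fee: int) -> bool:
    """tx.fee >= resourceFee(tx) + sorobanData.refundableFee + 100."""
    return inclusion_fee_bid(tx_fee, resource_fee, refundable_fee) >= MIN_INCLUSION_FEE_BID


def total_fee_charged(
    tx_fee: int,
    resource_fee: int,
    refundable_fee: int,
    base_fee: Optional[int] = None,
) -> int:
    """Fee taken from the source account before applying the transaction.

    base_fee is None when the transaction set component does not set baseFee.
    """
    if base_fee is None:
        return tx_fee
    bid = inclusion_fee_bid(tx_fee, resource_fee, refundable_fee)
    return tx_fee - bid + base_fee
```

With a discounted component, the submitter pays the resource and refundable parts in full but only `baseFee` for inclusion, regardless of how much was bid.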
-At the end of the transaction execution, refund the source account for resources that are eligible for a refund (this refund is reflected under `txChangesAfter` in the meta). +If the transaction fails, `effectiveRefundableFee` is set to `0`. -Right now the following resources fall under this category: - * extended meta data +After executing the transaction, the refund amount is computed as `sorobanData.refundableFee - effectiveRefundableFee` and that amount (when non-zero) is refunded to the transaction source account. The ledger modification due to the refund is reflected under `txChangesAfter` in the meta. -The total fee `tx.fee`, `tx.refundableFee` and other fees are connected in the following way: +Note that the refund happens for failed transactions as well. -`tx.fee = resourcesFee(tx) + inclusionFeeBid + tx.refundableFee` +#### Per-resource specifications -with -` -resourcesFee(tx) = Instructions_fee(tx) + LedgerDataAccess_fee(tx) + - NetworkData_fee(tx) + historical_flat_fee(txEnvelope) -` +This section describes the fee contributions, per-transaction/per-ledger maximum limits and apply-time enforcement for all the transaction resources. -#### Execution time +#### Instructions + +Instructions bound the execution time of the transactions in the ledger. A transaction contains: -* how many "Instructions" they want to bid for. `uint32 Tx.instructions`. +* the maximum number of CPU instructions that the transaction may use: `sorobanData.resources.instructions` + +All the configuration values come from `ConfigSettingContractComputeV0`. -`Instructions_fee(Tx) = round_up(Tx.instructions*feeRatePerInstructionsIncrement/INSTRUCTIONS_INCREMENT)` +Fee: `Instructions_fee(instructions) = round_up(instructions * feeRatePerInstructionsIncrement / 10000)` Validity constraints: * per transaction - * `Tx.instructions <= txMaxInstructions`. + * `resources.instructions <= txMaxInstructions`.
* ledger wide (`GeneralizedTransactionSet`) - * sum of all `tx.instructions` <= `ledgerMaxInstructions`. + * sum of all `resources.instructions` <= `ledgerMaxInstructions`. + +Apply-time enforcement: instructions metered during the contract execution may not exceed the `instructions` value declared in the transaction. Refer to [CAP-0046-10](./cap-0046-10.md) for metering details. #### Ledger data +Ledger data resources bound the amount and size of ledger reads and writes. + A transaction contains: -* the read `tx.LedgerFootprint.readOnly` and read/write `tx.LedgerFootprint.readWrite` sets (ledger keys). -* the maximum total amount of data `uint32 tx.readBytes` that gets read by `tx.LedgerFootprint.readOnly` and `tx.LedgerFootprint.readWrite`. -* the maximum total amount of data `uint32 tx.writeBytes` that can be written by `tx.LedgerFootprint.readWrite`. +* the read `sorobanData.resources.footprint.readOnly` and read/write `sorobanData.resources.footprint.readWrite` sets of ledger keys. +* the maximum total amount of data that can be read from the ledger in bytes `sorobanData.resources.readBytes` +* the maximum total amount of data that can be written to the ledger in bytes `sorobanData.resources.writeBytes` +All the configuration values come from `ConfigSettingContractLedgerCostV0`.
+Fee: ``` -LedgerDataAccess_fee(tx) = - (length(tx.LedgerFootprint.readOnly)+length(tx.LedgerFootprint.readWrite))*feeReadLedgerEntry + // cost of reading ledger entries - length(tx.LedgerFootprint.readWrite)*feeWriteLedgerEntry + // cost of writing ledger entries - round_up(tx.readBytes * feeRead1KB / 1024) + // cost of processing reads - round_up(wfee_rate(lcl.BucketListSize)* tx.writeBytes)) // cost of adding to the bucket list +LedgerDataAccess_fee(resources) = + (length(resources.footprint.readOnly)+length(resources.footprint.readWrite))*feeReadLedgerEntry + // cost of reading ledger entries + length(resources.footprint.readWrite)*feeWriteLedgerEntry + // cost of writing ledger entries + round_up(resources.readBytes * feeRead1KB / 1024) + // cost of processing reads + round_up(write_fee_per_1kb(BucketListSize)* resources.writeBytes / 1024) // cost of adding to the bucket list ``` -With +where `BucketListSize` is the average size of the bucket list over the moving window. Refer to the State Expiration CAP for details (TODO), and `write_fee_per_1kb` is a function that determines the ledger write fee per 1024 bytes based on the bucket list size and is defined as follows: ``` -wfee_rate(s) = (bucketListFeeRateHigh - bucketListFeeRateLow)*s/bucketListSizeBytes + -bucketListFeeRateLow + -(if s > bucketListSizeBytes, - bucketListGrowthFactor* - (bucketListFeeRateHigh - bucketListFeeRateLow)* - (s-bucketListSizeBytes)/bucketListSizeBytes, - 0) +write_fee_per_1kb(s) = (writeFee1KBBucketListHigh - writeFee1KBBucketListLow)*s/bucketListTargetSizeBytes + + writeFee1KBBucketListLow + + (if s > bucketListTargetSizeBytes, + bucketListWriteFeeGrowthFactor* + (writeFee1KBBucketListHigh - writeFee1KBBucketListLow)* + (s-bucketListTargetSizeBytes)/bucketListTargetSizeBytes, + 0) ``` -__Open:__ `wfee_rate` (and possibly `LedgerDataAccess_fee`) needs to be reconciled with "rent" (not finalized at the time of this writing) as "paying rent" is very similar to adding an entry to the 
ledger. The difference is that adding an entry to the ledger adds the entry to the "topmost" bucket, where as paying rent is adding an entry to a bucket while merging. - Validity constraints: * per transaction - * `length(tx.LedgerFootprint.readOnly) <= txMaxReadLedgerEntries`. - * `tx.readBytes <= txMaxReadBytes`. - * `length(tx.LedgerFootprint.readWrite) <= txMaxWriteLedgerEntries`. - * `tx.writeBytes <= txMaxWriteBytes`. + * `length(resources.footprint.readOnly) + length(resources.footprint.readWrite) <= txMaxReadLedgerEntries`. + * `resources.readBytes <= txMaxReadBytes`. + * `length(resources.footprint.readWrite) <= txMaxWriteLedgerEntries`. + * `resources.writeBytes <= txMaxWriteBytes`. * ledger wide (`GeneralizedTransactionSet`) - * `sum(length(tx.LedgerFootprint.readOnly) + length(tx.LedgerFootprint.readWrite)) <= ledgerMaxReadLedgerEntries`. + * `sum(length(resources.footprint.readOnly) + length(resources.footprint.readWrite)) <= ledgerMaxReadLedgerEntries`. -#### Historical storage +Apply-time enforcement: -`historical_flat_fee(txEnvelope) = round_up((size(txEnvelope)+TX_BASE_RESULT_SIZE) * feeHistorical1KB / 1024)` +* Before executing the transaction logic all the entries in the footprint (both read-only and read-write) are read from the ledger and the total read size is computed by adding the size of the key and size of the entry read (if any) to the total value. If total read size exceeds `resources.readBytes`, transaction fails. +* During the host function execution any read/write of a ledger key outside of the footprint (or write of a read-only entry) leads immediately to a transaction failure. +* After the execution the total size of the writes is computed by adding sizes of the keys and values of the non-removed entries. If the total write size exceeds `resources.writeBytes`, transaction fails. Entry deletion is 'free' and not counted towards the total write size. 
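The ledger data fee formulas above can be sketched in Python as follows. This is an illustration under stated assumptions, not the host implementation: the `LedgerCostConfig` class is a hypothetical stand-in for the fee-related fields of `ConfigSettingContractLedgerCostV0`, and the divisions inside `write_fee_per_1kb` are assumed to truncate, since the text does not specify their rounding.

```python
from dataclasses import dataclass


def ceil_div(n: int, d: int) -> int:
    """round_up(n / d) for non-negative integers."""
    return -(-n // d)


@dataclass(frozen=True)
class LedgerCostConfig:
    # Hypothetical mirror of the fee fields of ConfigSettingContractLedgerCostV0.
    fee_read_ledger_entry: int
    fee_write_ledger_entry: int
    fee_read_1kb: int
    bucket_list_target_size_bytes: int
    write_fee_1kb_bucket_list_low: int
    write_fee_1kb_bucket_list_high: int
    bucket_list_write_fee_growth_factor: int


def write_fee_per_1kb(cfg: LedgerCostConfig, bucket_list_size: int) -> int:
    """Write fee per 1024 bytes: linear up to the target size, steeper past it."""
    spread = cfg.write_fee_1kb_bucket_list_high - cfg.write_fee_1kb_bucket_list_low
    target = cfg.bucket_list_target_size_bytes
    # Assumption: truncating integer division.
    fee = spread * bucket_list_size // target + cfg.write_fee_1kb_bucket_list_low
    if bucket_list_size > target:
        fee += (
            cfg.bucket_list_write_fee_growth_factor
            * spread
            * (bucket_list_size - target)
            // target
        )
    return fee


def ledger_data_access_fee(
    cfg: LedgerCostConfig,
    num_read_only: int,
    num_read_write: int,
    read_bytes: int,
    write_bytes: int,
    bucket_list_size: int,
) -> int:
    """LedgerDataAccess_fee(resources) for the declared footprint and byte limits."""
    return (
        (num_read_only + num_read_write) * cfg.fee_read_ledger_entry  # entry reads
        + num_read_write * cfg.fee_write_ledger_entry  # entry writes
        + ceil_div(read_bytes * cfg.fee_read_1kb, 1024)  # bytes read
        + ceil_div(write_fee_per_1kb(cfg, bucket_list_size) * write_bytes, 1024)  # bytes written
    )
```

Read-write entries are charged both the per-entry read fee and the per-entry write fee, which matches the first two terms of the formula.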
-Where `TX_BASE_RESULT_SIZE` is a constant approximating the size in bytes of transaction results published to archives. +#### Bandwidth related -Validity constraints: -_None_ +Bandwidth utilization is bounded by the total size of the transactions flooded and included to the ledger. -#### Extended meta data - -__open:__ we could consider removing `tx.extendedMetaDataSizeBytes` and instead just use `txMaxExtendedMetaDataSizeBytes`. -There are a couple potential problems: -* this may artifically increase the minimum account balance required to submit a transaction -* transactions may fail if the network votes to increase `txMaxExtendedMetaDataSizeBytes` (this problem may have to be solved in general anyways) +All the configuration values come from `ConfigSettingContractBandwidthV0`. A transaction contains: -* `uint32 tx.extendedMetaDataSizeBytes` the maximum size of extended data produced by this transaction +* implicitly, its impact in terms of bandwidth utilization, the size (in bytes) of the `TransactionEnvelope` -`extendedMetaData_flat_fee(tx) = round_up(tx.extendedMetaDataSizeBytes * feeExtendedData1KB / 1024)` +Fee: `NetworkData_fee(txEnvelope) = round_up(size(txEnvelope) * feeTxSize1KB / 1024)` Validity constraints: * per transaction - * `tx.extendedMetaDataSizeBytes <= txMaxExtendedMetaDataSizeBytes` + * `size(txEnvelope) <= txMaxSizeBytes` +* ledger wide + * sum of all `size(txEnvelope)` <= `ledgerMaxTxsSizeBytes`. -#### Bandwidth related +Apply-time enforcement: _None_ -A transaction contains: -* implicitely, its impact in terms of bandwidth utilization, the size (in bytes) of the `TransactionEnvelope` +#### Historical storage -`NetworkData_fee(tx) = round_up(size(txEnvelope) * feePropagateData1KB / 1024)` +Historical storage is utilized for any transaction result and hence the fee has to be paid unconditionally. The fee depends on `TransactionEnvelope` size. 
-Validity constraints: -* per transaction - * `size(txEnvelope) <= txMaxTxSizeBytes` -* ledger wide - * sum of all `size(txEnvelope)` <= `ledgerMaxPropagateSizeBytes`. +All the configuration values come from `ConfigSettingContractHistoricalDataV0`. -#### Refundable resource fee +Fee: `Historical_flat_fee(txEnvelope) = round_up((size(txEnvelope)+TX_BASE_RESULT_SIZE) * feeHistorical1KB / 1024)` -A transaction contains: -* `int64 tx.refundableFee` fee shared by all “flat fee” resources +Where `TX_BASE_RESULT_SIZE` is a constant approximating the size in bytes of transaction results published to archives and is set to `300`. -`tx.refundableFee` must be greater than -`extendedMetaData_flat_fee(tx)` +Validity constraints: _None_ -## "Fee bump" and failing transactions +Apply-time enforcement: _None_ + +#### Contract events and return value + +Contract events are a 'side' output of the transaction that is written to metadata and not to ledger. Invocation return value has the same properties and thus is included into this as well. + +Note, that ledger changes are also emitted in metadata for transaction, but their size is bounded by proxy with ledger access limits and we can consider write fees to also cover metadata writes as well. + +All the configuration values come from `ConfigSettingContractEventsV0`. + +Fee: `Events_fee(eventsBytes) = round_up(eventsBytes * feeContractEvents1KB / 1024)` + +Validity constraints: _None_ + +Apply-time enforcement: +* compute the consumed events size as the sum of events emitted during the host function invocation and its return value. If total size exceeds `ConfigSettingContractEventsV0.txMaxContractEventsSizeBytes`, the transaction fails + +#### Rent fee + +Rent fee has to be paid if operation increases the lifetime of the ledger entries and/or increases entry size. + +Rent fee is computed only at transaction application time and it depends on the state of the ledger entries before and after the transaction has been applied. 
+ +Fee: `Rent_fee = sum(rent_fee_per_entry_change(entry_before, entry_after)) + expiration_write_fee` for all the ledger entry changes. + +Entry rent fee consists of two components: fee for renting new ledgers with the new entry size and fee for renting the old ledgers with increased size. If `entry_before` does not exist, we treat its size as `0` and expiration ledger as `0` for the sake of this formula. + +``` +rent_fee_per_entry_change(entry_before, entry_after) = + if (entry_after.expiration_ledger > entry_before.expiration_ledger, + rent_fee_for_size_and_ledgers( + entry_after.is_persistent, + size(entry_after), + entry_after.expiration_ledger - max(entry_before.expiration_ledger, current_ledger - 1)), + 0) + + if (exists(entry_before) && size(entry_after) > size(entry_before), + rent_fee_for_size_and_ledgers( + entry_after.is_persistent, + size(entry_after) - size(entry_before), + entry_before.expiration_ledger - current_ledger + 1), + 0) +``` + +`rent_fee_for_size_and_ledgers` is the main rent primitive that computes the fee for renting `S` bytes of ledger space for the period of `L` ledgers: + +``` +rent_fee_for_size_and_ledgers(is_persistent, S, L) = round_up( + S * L * write_fee_per_1kb(BucketListSize) / + (1024 * + if (is_persistent, persistentRentRateDenominator, tempRentRateDenominator)) +) +``` + +Setting values come from `StateExpirationSettings`. + +Additionally, we charge for the `ExpirationEntry` writes of entries that had `expirationLedger` changed using the same rate as for any other entry write: + +``` +expiration_write_fee = + num_expiration_updates * feeWriteLedgerEntry + + round_up(write_fee_per_1kb(BucketListSize) * EXPIRATION_ENTRY_SIZE / 1024) +``` + +where `num_expiration_updates` is the number of ledger entries that had expiration ledger updated and `EXPIRATION_ENTRY_SIZE` is the size of `ExpirationEntry` with its key and is set to `68` bytes.
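The per-entry rent formula above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the `fees.rs` implementation: the `Entry` and `RentConfig` types, the function signatures, and the sample configuration values are hypothetical, and `write_fee_per_1kb` is taken as an already-computed input rather than derived from the bucket list size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    # Hypothetical stand-in for a ledger entry; only the fields the formula needs.
    size: int
    expiration_ledger: int
    is_persistent: bool


@dataclass
class RentConfig:
    # Hypothetical container; write_fee_per_1kb is assumed to be precomputed.
    write_fee_per_1kb: int
    persistent_rent_rate_denominator: int
    temp_rent_rate_denominator: int


def round_up_div(numerator: int, denominator: int) -> int:
    """Integer division rounded up, for non-negative inputs."""
    return -(-numerator // denominator)


def rent_fee_for_size_and_ledgers(cfg: RentConfig, is_persistent: bool,
                                  size: int, ledgers: int) -> int:
    """Fee for renting `size` bytes for `ledgers` ledgers."""
    rate = (cfg.persistent_rent_rate_denominator if is_persistent
            else cfg.temp_rent_rate_denominator)
    return round_up_div(size * ledgers * cfg.write_fee_per_1kb, 1024 * rate)


def rent_fee_per_entry_change(cfg: RentConfig, before: Optional[Entry],
                              after: Entry, current_ledger: int) -> int:
    # A missing `before` entry is treated as size 0 and expiration ledger 0.
    old_size = before.size if before is not None else 0
    old_expiration = before.expiration_ledger if before is not None else 0
    fee = 0
    # Component 1: newly rented ledgers, charged at the new entry size.
    if after.expiration_ledger > old_expiration:
        new_ledgers = after.expiration_ledger - max(old_expiration, current_ledger - 1)
        fee += rent_fee_for_size_and_ledgers(cfg, after.is_persistent, after.size, new_ledgers)
    # Component 2: already rented ledgers, charged for the size increase only.
    if before is not None and after.size > old_size:
        remaining = old_expiration - current_ledger + 1
        fee += rent_fee_for_size_and_ledgers(
            cfg, after.is_persistent, after.size - old_size, remaining)
    return fee
```

With these assumed settings, creating a 1024-byte persistent entry at ledger 100 that expires at ledger 199 rents 100 ledgers, while growing an existing entry without extending it is charged only for the added bytes over the remaining ledgers.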
-__Open:__ -The high number of settings involved in making a transaction succeed at runtime may cause additional usability issues. +Validity constraints: _None_ -Transactions may fail due to variance in behavior (similar to how footprints can change): underestimating any of the transaction level field may cause the transaction to be included in a ledger, but fail later on (causing the sequence number to be consumed and fees burned). +Apply-time enforcement: _None_ -There might be a need for a "fee bump" wrapper transaction of sorts that takes the burden in case of failure, and that allows to override the problematic fields (including footprints and any limit and fee related fields). +#### Operations -Unlike "fee bump", the outer "bump" transaction would have its own sequence number/fees. +Every Soroban transaction must contain exactly 1 operation. There is no fee for operations, but there is a ledger-wide limit on transactions (and thus operations) defined by `ConfigSettingContractExecutionLanesV0.ledgerMaxTxCount`. + +## "Fee bump" and failing transactions + +Soroban transactions are compatible with 'fee bump' transactions, so the total transaction fee can be increased in order to account for higher network contention or even to increase resource fees (as both resource fee and inclusion fee are a part of `tx.fee`). + +Soroban transactions might also fail at apply time because the declared resource values or the refundable fee are too low. The transaction sequence number will be consumed and the fees will be withdrawn. This may result in a poor user experience when it is hard to obtain a new signature for the transaction. + +We don't provide any built-in way for re-using failed transactions in the first version of Soroban.
However, the user experience can be significantly improved by decoupling the transaction signature from the signatures used for the host function invocation itself, specifically by using the Soroban Authorization Framework ([CAP-0046-11](./cap-0046-11.md)). If all the signatures are decoupled, then any party can pay the transaction fees and sign new transactions in case of failure. Nonces will only be consumed on transaction success, so the signatures can be re-used as many times as needed until the transaction succeeds. ## Design Rationale @@ -316,7 +428,6 @@ This proposal relies heavily on the existence of a "preflight" mechanism to dete Additional logic (not covered in this CAP), will be needed to determine the market rate of resources based for example on historical data (see below). - ### Resources Fees are used to ensure fair and balanced utilization of resources. @@ -338,7 +449,7 @@ Validators are expected to vote regularly (once a quarter for example) to ensure [CAP-0046: WebAssembly Smart Contract Runtime Environment](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-01.md) introduces the notion of virtual instructions. In the context of this CAP, the only thing that matters is that an "instruction" represents an arbitrary base unit for "execution time". -As a consequence, the "goal" for validators is to construct a `GeneralizedTransactionSet` that uses up to `lcl.ContractNetworkSettingsV0.ledgerMaxInstructions`. +As a consequence, the "goal" for validators is to construct a `GeneralizedTransactionSet` that uses up to `lcl.ConfigSettingContractComputeV0.ledgerMaxInstructions`. #### Ledger data @@ -358,15 +469,13 @@ The cost of a "ledger entry read" is fairly open ended, and depends on many vari That "base cost" is defined by validators as `feeReadLedgerEntry`. 
This proposal does not let transactions compete directly on the number of ledger entry read operations, therefore the cost of a read operation is `feeReadLedgerEntry` (validators must still construct transaction sets that keep the number of reads below a maximum). -Market dynamics are limited to the number of bytes (see below). - Transactions contain the total number of bytes that they will read from the bucket list as well at a fee bid for reading those bytes. The number of bytes read corresponds to the size of the latest `BucketEntry` for that ledger entry (and does not take into account the possibility that an implementation may read stale entries in buckets or may have to read other entries from a bucket). The fee is determined based on the rate `feeRead1KB` expressed for reading 1 KB (1024 bytes) worth of data. -As transactions compete for the total read capacity `ledgerMaxReadBytes` for a given ledger, the effective fee goes up. +As transactions compete for the total read capacity `ledgerMaxReadBytes` for a given ledger, the inclusion fee goes up. ##### Write traffic and ledger size @@ -419,30 +528,30 @@ As a consequence the final formula looks like this: `fee(b) = round_up(b*fee_rate(s))` With -`fee_rate(s) = (feeRateM - feeRate)*s/M_base + feeRate + (if s > M_base, exp(K*(s-M_base)/B_buffer), 0)` +`fee_rate(s) = (feeRateM - feeRate)*s/M_base + feeRate + if (s > M_base, exp(K*(s-M_base)/B_buffer), 0)` -We can simplify this even further by replacing the exponential component by a steep linear slope that causes fees to be "extremely high" at `M_buffer`. +We can simplify this even further by replacing the exponential component by a steep linear slope that causes fees to be "extremely high" at `M_buffer`, which turns the formula into what is specified above: -##### Putting it together - -"read/write" operations need to first read data before writing it. 
The amount of data written back can be larger or smaller than what was read, as consequence: -* The number of ledger entry reads is the size of ledger entries referenced in ledger footprints (both read and read/write). -* The number of bytes to read is the size of bucket entries from both the read and read/write footprints. -* The number of bytes to write is the number of bytes associated with bucket entries referenced by the readWrite footprint. -* The number of ledger entry to write is the size of the read/write footprint. +`fee_rate(s) = (feeRateM - feeRate)*s/M_base + feeRate + if (s > M_base, K*(s-M_base)/B_buffer, 0)` -##### Effective fee and flooding +where `K >= 1`. -Having the fee model depend on ledger size creates some complication when trying to reason about multiple transactions getting applied: a naive solution would just try to follow the price curve exactly, causing fees to evolve on a per transaction basis. -This would create incentives to front-run transactions within the same transaction set, and would also make it hard to decide if a transaction should get flooded. +##### Ledger size averaging -In the context of this proposal, we make the following observation: assuming that validators pick a small upper bound for the total number of bytes that can be added to the bucket list per ledger relative to the bucket list size, we can then assume that the price of storage varies marginally per ledger (ie the price paid by the first transaction is not that different from the last transaction in a ledger), and that the reference price still "resets" to the proper price every ledger (as it's based on the size of the bucket list). 
+Tracking the ledger size for every ledger introduces unnecessary noise that leads to the following issues: +* flooding might be somewhat imprecise due to fees changing every ledger with a risk of transactions becoming invalid +* wrong incentives, such as trying to pay the rent for a long time period right after the bucket list merge ledger +* fee estimations are harder for the clients -As a consequence, we can just price transactions independently, based on the bucket list from the last closed ledger, and allocation within a transaction (for multiple bytes) is a flat rate as well. +To alleviate all of these issues, instead of using the current ledger size, this proposal uses the average ledger size over a sliding window that is large enough to average out most of the noise coming from short-term merges, so that the value reflects the trend in ledger size rather than the actual size at any moment. -Flooding transactions for the next ledger in that context is straightforward as the only factor to take into account is bucket list size (determined based on the last closed ledger). +##### Putting it together -Note that transactions can still be invalidated more than a ledger in the future, as a consequence validators may apply a certain level of "padding" when computing the fee required before flooding those transactions. +"read/write" operations need to first read data before writing it. The amount of data written back can be larger or smaller than what was read, as a consequence: +* The number of ledger entry reads is the number of ledger entries referenced in ledger footprints (both read and read/write). +* The number of bytes to read is the size of bucket entries from both the read and read/write footprints. +* The number of bytes to write is the number of bytes associated with bucket entries referenced by the readWrite footprint. +* The number of ledger entries to write is the size of the read/write footprint.
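The four quantities in the list above can be derived from a footprint as in the following minimal Python sketch. The data layout is an assumption made for illustration only: `read_only` maps a hypothetical entry key to its size in bytes, and `read_write` maps a key to a `(size_before, size_after)` pair, since the amount written back may differ from the amount read.

```python
from typing import Dict, Tuple


def footprint_resources(read_only: Dict[str, int],
                        read_write: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Summarize ledger access implied by a footprint (illustrative layout)."""
    # Every entry in either footprint has to be read first.
    read_entries = len(read_only) + len(read_write)
    read_bytes = sum(read_only.values()) + sum(before for before, _ in read_write.values())
    # Only entries in the read/write footprint are written back.
    write_entries = len(read_write)
    write_bytes = sum(after for _, after in read_write.values())
    return {
        "read_entries": read_entries,
        "read_bytes": read_bytes,
        "write_entries": write_entries,
        "write_bytes": write_bytes,
    }
```

For example, two read-only entries of 100 and 50 bytes plus one read/write entry that grows from 200 to 260 bytes give 3 entry reads, 350 bytes read, 1 entry write, and 260 bytes written.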
##### Ledger size reduction @@ -471,7 +580,7 @@ The model retained in the context of this CAP is to just have the validators set ##### Transaction Result -In order to reduce the base cost of transactions, the "result" published to archive is fixed size and the actual detailed transaction result is now emitted in the meta. See [CAP-0046: Smart Contract Events](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-08.md) for more details. +In order to reduce the base cost of transactions, the "result" published to archive is fixed size and the actual detailed transaction result is emitted in the meta and accounted for in the same way as contract events. See [CAP-0046: Smart Contract Events](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-08.md) for more details. #### Extended meta data @@ -485,7 +594,6 @@ Fees are needed to control for the overhead in those systems. The model retained in this CAP is a flat rate per byte model for simplicity. It is expected that this fee would be orders of magnitude smaller than what is needed to persist data on chain. - #### Bandwidth Transactions need to be propagated to peers on the network. @@ -560,11 +668,25 @@ A subsequent CAP may update the fee model for the existing classic transaction s ### Resource Utilization +There are no significant resource utilization changes compared to the classic fee model. + ## Security Concerns +The resource fees and limits are introduced to maintain network health and therefore all the risks are around network liveness and the possibility of DoS, but not necessarily security. + +Incorrect configuration or incorrect enforcement calibration might lead to high ledger close times or spam. ## Test Cases + +The fees are covered in most of the Soroban-related test cases.
+ ## Implementation +[TransactionFrame::validateSorobanResources](https://github.com/stellar/stellar-core/blob/0df2e0c6f80d2c461870e837fbe50fa16f9048f3/src/transactions/TransactionFrame.cpp#L588) enforces the limits at transaction validation time. + + +[InvokeHostFunctionOpFrame::doApply](https://github.com/stellar/stellar-core/blob/0df2e0c6f80d2c461870e837fbe50fa16f9048f3/src/transactions/InvokeHostFunctionOpFrame.cpp#L379) performs most of the apply-time resource limit enforcement. + +[`fees.rs`](https://github.com/stellar/rs-soroban-env/blob/d92944576e2301c9866215efcdc4bbd24a5f3981/soroban-env-host/src/fees.rs) file of Soroban host contains all the fee computation logic specified here. diff --git a/core/cap-0046.md b/core/cap-0046.md index d72d6b4be..55f3d4ffb 100644 --- a/core/cap-0046.md +++ b/core/cap-0046.md @@ -91,7 +91,7 @@ the following sub-CAPs: - [CAP-0046-06 (ex-0054) - Smart Contract Standardized Asset](./cap-0046-06.md) covers the built-in token contract, that can also "wrap" existing Stellar assets. - - [CAP-0046-07 (ex-0055) - Fee Model in Smart Contracts](./cap-0046-07.md) + - [CAP-0046-07 (ex-0055) - Fee and Resource Model in Smart Contracts](./cap-0046-07.md) covers changes to the network's fee-charging system to account for smart contracts.
- [CAP-0046-08 (ex-0056) - Smart Contract Logging](./cap-0046-08.md) covers From f4d056e39f9a19f12eba1d9ef239f27c39d2488a Mon Sep 17 00:00:00 2001 From: Jay Geng Date: Fri, 25 Aug 2023 14:33:16 -0400 Subject: [PATCH 5/7] Update the host function cap (0046-03) (#1378) * Update the host function cap (0046-03) * Update error handling section of host function cap --- core/cap-0046-03.md | 772 ++++++++++++++++++++++---------------------- core/cap-0046-10.md | 6 +- 2 files changed, 393 insertions(+), 385 deletions(-) diff --git a/core/cap-0046-03.md b/core/cap-0046-03.md index fd5ea5784..77957cd22 100644 --- a/core/cap-0046-03.md +++ b/core/cap-0046-03.md @@ -4,9 +4,9 @@ CAP: 0046-03 (formerly 0051) Title: Smart Contract Host Functions Working Group: - Owner: Jay Geng <@jayz22> - Authors: TBD - Consulted: Graydon Hoare <@graydon>, Leigh McCulloch <@leighmcculloch>, Nicolas Barry <@MonsieurNicolas>, Siddharth Suresh <@sisuresh> + Owner: Jay Geng <@jayz22>, Graydon Hoare <@graydon> + Authors: Jay Geng <@jayz22> + Consulted: Leigh McCulloch <@leighmcculloch>, Nicolas Barry <@MonsieurNicolas>, Siddharth Suresh <@sisuresh> Status: Draft Created: 2022-05-20 Discussion: TBD @@ -14,525 +14,533 @@ Protocol version: TBD ``` ## Simple Summary -This CAP proposes a set of host functions — interface between the host environment running on the Stellar Core and the WebAssembly-based (WASM) virtual machine running smart contracts, as well as expands the host object repertoire on which those host functions operate. This CAP also lays out a framework for resource accounting and gas metering on the smart contracts. +This CAP proposes a set of host functions — interface between the host environment running on the Stellar Core and the WebAssembly-based (WASM) virtual machine running smart contracts. ## Motivation amd Goals Alignment See the Soroban overview CAP. ## Abstract -This CAP specifies the signatures of host functions that serve as the host-VM interface, divided into logical modules. 
It then introduces new host objects, their XDR signature, and new semantics of their conversion and comparison. The selection criteria of the host functions and the framework of resource accounting are detailed in the Design Rationale. +This CAP specifies the signatures of host functions that serve as the host-VM interface, divided into logical modules. The selection criteria of the host functions and the framework of resource accounting are detailed in the Design Rationale. ## Specification -The entire suite of host functions are broken down into logical modules, each evolving around a specific area of functionality (e.g. map, vector, BigInt). +The entire suite of host functions are broken down into logical modules, each evolving around a specific area of functionality (e.g. map, vector, integer). The host functions, which define the interface between the host environment and the virtual machine (VM), are specified in [WebAssembly text format](https://developer.mozilla.org/en-US/docs/WebAssembly/Understanding_the_text_format) to preserve generality, since implementation of the host functions are supposed to be language agnostic. There are a few properties and conventions that apply generally to all host functions, they are outlined below to avoid repeating on every function. -#### Exception safety -Execution of the host function should never cause an exception in the host environment. If the execution fails for any reason, the host will emit a trap to the VM to stop execution immediately and abort the underlying transaction. Here are a few general conditions resulting in a trap: -1. The guest runs out of gas. Resource accounting and gas fee calculation will be discussed later. -2. Trying to create new host objects when there’s no slots left. Total number of host objects cannot exceed `UINT32_MAX+1`. -3. A host object handle does not correspond to the intended host object type. -4. Invalid reference to a host object is provided. 
+#### Error and trap +Execution of the host function should never cause an exception in the host environment. If the execution fails for any reason, the host will emit a trap to the VM to stop the execution. +There can be a variety of reasons causing a host function execution to fail, see [error handling](#error-handling). + +In general, error propagation is not specified as part of the host interface specification. The only exception is the `try_call` function (inside module `d`), which may return the error code on failure if the error is recoverable. + +The error conditions on a host function should be self-explanatory and/or clearly documented. #### Parameter types and nomenclature -All parameters (input arguments and return value) are 64-bit integers, and they either represent a generic value or a specific subtype of host value (such as a number, a generic handle to a host object) or handle to a specific host object type. For the full definition of host value types, refer to [CAP-0046](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046.md#host-value-type). -For clarity, the input parameters are named by following rules to differentiate the value types: -- An `$obj` denotes a generic handle to a host object, this means “expect this argument to be handle to an host object, don’t care its type”. -- A handle to a specific host object type `xyz` is defined as `$obj_xyz`, e.g. `$obj_vec`. this means “expect this argument to be handle to an host object of type ‘xyz’, any other type would not work”. -- Multiple objects of the same type may have their names appended with `_a`, `_b` etc to differentiate. -- Any other names can be assumed to be a generic value, e.g. `idx`, `key`, `val`. +All parameters (input arguments and return value) are 64-bit integers, and they either represent a primitive integer value or a host value type specified in [CAP-0046-01](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-01.md#host-value-type).
+ +For clarity, the input parameters are named as "name underscore type" in "snake" case. For example `v: VecObject` in the Rust definition is translated to `param $v_vec_object i64`. #### Immutability -All host functions respect the immutability constraint on the host objects (see [CAP-0046](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046.md#immutability)). Any function that mutates a host object (e.g. `vec_push`) will create a new host object and return its handle. +All host functions respect the immutability constraint on the host objects (see [CAP-0046-01](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-01.md#immutability)). Any function that mutates a host object (e.g. `vec_push`) will create a new host object and return its handle. With that, we now present the host functions. -### General host functions +### "Context" host functions (mod `x`) ``` -/// Takes the two object handles and performs deep comparison. -/// Returns: -/// -1 if a < b, -/// 0 if a == b, -/// 1 if a > b -func $obj_cmp (param $obj_a i64) (param $obj_b i64) (result i64) - -/// Get the binary contractID of the contract which invoked the -/// running contract. Traps if the running contract was not -/// invoked by a contract. -func $get_current_contract (result i64) +;; Emit a diagnostic event containing a message and sequence of `Val`s. +(func $log_from_linear_memory (param $msg_pos_u32_val i64) (param $msg_len_u32_val i64) (param $vals_pos_u32_val i64) (param $vals_len_u32_val i64) (result i64)) + +;; Get the address object of the contract which invoked the running contract. Traps if the running contract was not invoked by a contract. +(func $get_invoking_contract (result i64)) + +;; Compare two objects, or at least one object to a non-object, structurally. Returns -1 if a < b, 1 if a > b, or 0 if a == b. +(func $obj_cmp (param $a_val i64) (param $b_val i64) (result i64)) + +;; Records a contract event.
`topics` is expected to be a `SCVec` with length <= 4 that cannot contain `Vec`, `Map`, or `Bytes` with length > 32. +(func $contract_event (param $topics_vec_object i64) (param $data_val i64) (result i64)) + +;; Return the protocol version of the current ledger as a u32. +(func $get_ledger_version (result i64)) + +;; Return the sequence number of the current ledger as a u32. +(func $get_ledger_sequence (result i64)) + +;; Return the timestamp number of the current ledger as a u64. +(func $get_ledger_timestamp (result u64)) + +;; Returns the full call stack from the first contract call to the current one as a vector of vectors, where the inside vector contains the contract id as Hash, and a function as a Symbol. +(func $get_current_call_stack (result i64)) + +;; Causes the currently executing contract to fail immediately with a provided error code, which must be of error-type `ScErrorType::Contract`. Does not actually return. +(func $fail_with_error (param $error_error i64) (result i64)) + +;; Return the network id (sha256 hash of network passphrase) of the current ledger as `Bytes`. The value is always 32 bytes in length. +(func $get_ledger_network_id (result i64)) + +;; Get the Address object for the current contract. +(func $get_current_contract_address (result i64)) + +;; Returns the max ledger sequence that an entry can live to (inclusive). +(func $get_max_expiration_ledger (result i64)) ``` -### Integer host functions +### "Integer" host functions (mod `i`) ``` -/// Constructs an object from a u64 and returns its handle -func $obj_from_u64 (param $val u64) (result i64) +;; Convert a `u64` to an object containing a `u64`. +(func $obj_from_u64 (param $v u64) (result i64)) -/// Takes an object handle and returns u64. Traps if the object is not an u64 type. -func $obj_to_u64(param $obj i64) (result u64) +;; Convert an object containing a `u64` to a `u64`. 
+(func $obj_to_u64 (param $obj_u64_object i64) (result u64)) -/// Constructs an object from (param $arg1 i64) and returns its handle -func $obj_from_i64(param $val i64) (result i64) +;; Convert an `i64` to an object containing an `i64`. +(func $obj_from_i64 (param $v i64) (result i64)) -/// Takes an object handle and returns i64. Traps if the object is not an i64 type. -func $obj_to_i64(param $obj i64) (result i64) -``` +;; Convert an object containing an `i64` to an `i64`. +(func $obj_to_i64 (param $obj_i64_object i64) (result i64)) +;; Convert the high and low 64-bit words of a u128 to an object containing a u128. +(func $obj_from_u128_pieces (param $hi u64) (param $lo u64) (result i64)) -### Map operations -``` -/// Construct an empty map, returns its handle -func $map_new (result i64) +;; Extract the low 64 bits from an object containing a u128. +(func $obj_to_u128_lo64 (param $obj_u128_object i64) (result u64)) -/// Insert a new element ($key, $val) into an existing map referenced by $obj, and return the new map handle -func $map_put (param $obj_map i64) (param $key i64) (param $val i64) (result i64) +;; Extract the high 64 bits from an object containing a u128. +(func $obj_to_u128_hi64 (param $obj_u128_object i64) (result u64)) -/// Given a key, return its value from an existing map. Trap if key not found. -func $map_get (param $obj_map i64) (param $key i64) (result i64) +;; Convert the high and low 64-bit words of an i128 to an object containing an i128. +(func $obj_from_i128_pieces (param $hi i64) (param $lo u64) (result i64)) -/// Given a key, delete its value from an existing map. Trap if key not found. -func $map_del (param $obj_map i64) (param $key i64) (result i64) +;; Extract the low 64 bits from an object containing an i128. +(func $obj_to_i128_lo64 (param $obj_i128_object i64) (result u64)) -/// Return length of an existing map. -func $map_len (param $obj_map i64) (result i64) +;; Extract the high 64 bits from an object containing an i128. 
+(func $obj_to_i128_hi64 (param $obj_i128_object i64) (result i64)) -/// Check if a key exists in a map. Returns a static true/false value (Tag 2). -func $map_has (param $obj_map i64) (param $key i64) (result i64) +;; Convert the four 64-bit words of a u256 (big-endian) to an object containing a u256. +(func $obj_from_u256_pieces (param $hi_hi u64) (param $hi_lo u64) (param $lo_hi u64) (param $lo_lo u64) (result i64)) -/// Given a key, find the first element less than itself in the map's sorted order. -/// If such an element exists, return its key, otherwise return an SCStatus containing the error code (TBD). -func $map_prev_key (param $obj_map i64) (param $key i64) (result i64) +;; Create a U256 `Val` from its representation as a byte array in big endian. +(func $u256_val_from_be_bytes (param $bytes_bytes_object i64) (result i64)) -/// Given a key, find the first element greater than itself in the map's sorted order. -/// If such an element exists, return its key, otherwise return an SCStatus containing the error code (TBD). -func $map_next_key (param $obj_map i64) (param $key i64) (result i64) +;; Return the memory representation of this U256 `Val` as a byte array in big endian byte order. +(func $u256_val_to_be_bytes (param $val_u256_val i64) (result i64)) -/// Find and return the minimum key in the map $obj_map. -/// If key doesn't exist, return an SCStatus containing the error code (TBD). -func $map_min_key (param $obj_map i64) (result i64) +;; Extract the highest 64-bits (bits 192-255) from an object containing a u256. +(func $obj_to_u256_hi_hi (param $obj_u256_object i64) (result u64)) -/// Find and return the maximum key in the map $obj_map. -/// If key doesn't exist, return an SCStatus containing the error code (TBD). -func $map_max_key (param $obj_map i64) (result i64) +;; Extract bits 128-191 from an object containing a u256. 
+(func $obj_to_u256_hi_lo (param $obj_u256_object i64) (result u64)) -/// Return handle to a new vector object containing all the keys in the map $obj_map. The new vector is ordered in the original map's key-sorted order. -func $map_keys (param $obj_map i64) (result i64) +;; Extract bits 64-127 from an object containing a u256. +(func $obj_to_u256_lo_hi (param $obj_u256_object i64) (result u64)) -/// Return handle to a new vector object containing all the values in the map $obj_map. The new vector is ordered in the original map's key-sorted order. -func $map_values (param $obj_map i64) (result i64) -``` +;; Extract the lowest 64-bits (bits 0-63) from an object containing a u256. +(func $obj_to_u256_lo_lo (param $obj_u256_object i64) (result u64)) -### Vector operations -``` -/// Creates a new vector with an optional capacity hint `opt_hint`. -/// If `opt_hint` is `ScStatic::Void`, no hint is assumed and the new vector is empty. -/// Otherwise, `opt_hint` is parsed as an `u32` that represents the initial capacity of the new vector. Returns handle to the new vector. -func $vec_new(param $opt_hint i64) (result i64) +;; Convert the four 64-bit words of an i256 (big-endian) to an object containing an i256. +(func $obj_from_i256_pieces (param $hi_hi i64) (param $hi_lo u64) (param $lo_hi u64) (param $lo_lo u64) (result i64)) -/// Replaces an element at index $idx of existing vector $obj with value $val. -/// Returns handle to the new vector. -func $vec_put(param $obj_vec i64) (param $idx i64) (param $val i64) (result i64) +;; Create a I256 `Val` from its representation as a byte array in big endian. +(func $i256_val_from_be_bytes (param $bytes_bytes_object i64) (result i64)) -/// Return value from an existing vector $obj at index $idx. -func $vec_get(param $obj_vec i64) (param $idx i64) (result i64) +;; Return the memory representation of this I256 `Val` as a byte array in big endian byte order. 
+(func $i256_val_to_be_bytes (param $val_i256_val i64) (result i64)) -/// Deletes element from an existing vector $obj at index $idx. -/// Returns handle to the new vector. -func $vec_del(param $obj_vec i64) (param $idx i64) (result i64) +;; Extract the highest 64-bits (bits 192-255) from an object containing an i256. +(func $obj_to_i256_hi_hi (param $obj_i256_object i64) (result i64)) -/// Returns length of an existing vector $obj. -func $vec_len(param $obj_vec i64) (result i64) +;; Extract bits 128-191 from an object containing an i256. +(func $obj_to_i256_hi_lo (param $obj_i256_object i64) (result u64)) -/// Push a value $val to the back of a vector $obj. -/// Returns handle to the new vector. -func $vec_push(param $obj_vec i64) (param $val i64) (result i64) +;; Extract bits 64-127 from an object containing an i256. +(func $obj_to_i256_lo_hi (param $obj_i256_object i64) (result u64)) -/// Remove the last element from an existing vector $obj. -/// Returns handle to the new vector. -func $vec_pop(param $obj_vec i64) (result i64) +;; Extract the lowest 64-bits (bits 0-63) from an object containing an i256. +(func $obj_to_i256_lo_lo (param $obj_i256_object i64) (result u64)) -/// Returns the first element from a vector $obj -func $vec_front(param $obj_vec i64) (result i64) +;; Performs checked integer addition. Computes `lhs + rhs`, returning `ScError` if overflow occurred. +(func $u256_add (param $lhs_u256_val i64) (param $rhs_u256_val i64) (result i64)) -/// Returns the last element from a vector $obj -func $vec_back(param $obj_vec i64) (result i64) +;; Performs checked integer subtraction. Computes `lhs - rhs`, returning `ScError` if overflow occurred. +(func $u256_sub (param $lhs_u256_val i64) (param $rhs_u256_val i64) (result i64)) -/// Insert an element $val at position $idx, shifting all elements after it to the right. -/// Returns handle to the new vector. 
-func $vec_insert(param $obj_vec i64) (param $idx i64) (param $val i64) (result i64) +;; Performs checked integer multiplication. Computes `lhs * rhs`, returning `ScError` if overflow occurred. +(func $u256_mul (param $lhs_u256_val i64) (param $rhs_u256_val i64) (result i64)) -/// Append the vector $obj_vec_b to the end of another vector $obj_vec_a. -func $vec_append(param $obj_vec_a i64) (param $obj_vec_b i64) (result i64) +;; Performs checked integer division. Computes `lhs / rhs`, returning `ScError` if `rhs == 0` or overflow occurred. +(func $u256_div (param $lhs_u256_val i64) (param $rhs_u256_val i64) (result i64)) -/// Extract a slice from a vector $obj_vec at position $pos with length $len. -/// Returns handle to the new vector. -func $vec_slice(param $obj_vec i64) (param $pos u64) (param $len u64) (result i64) -``` +;; Performs checked exponentiation. Computes `lhs.exp(rhs)`, returning `ScError` if overflow occurred. +(func $u256_pow (param $lhs_u256_val i64) (param $rhs_u32_val i32) (result i64)) -### Invoking another function -``` -/// Calls another function $func in a contract $contract with variadic arguments stored in a vector $args_vec. -/// If the call is successful, it forwards the result of the called function. Otherwise, it traps. -func $call(param $contract i64) (param $func i64) (param $args_vec i64) (result i64) -``` +;; Performs checked shift left. Computes `lhs << rhs`, returning `ScError` if `rhs` is larger than or equal to the number of bits in `lhs`. +(func $u256_shl (param $lhs_u256_val i64) (param $rhs_u32_val i32) (result i64)) -### Big integer operations -``` -/// Constructs a BigInt from an u64. -func $bigint_from_u64(param $val u64) (result i64) +;; Performs checked shift right. Computes `lhs >> rhs`, returning `ScError` if `rhs` is larger than or equal to the number of bits in `lhs`. +(func $u256_shr (param $lhs_u256_val i64) (param $rhs_u32_val i32) (result i64)) + +;; Performs checked integer addition. 
Computes `lhs + rhs`, returning `ScError` if overflow occurred. +(func $i256_add (param $lhs_i256_val i64) (param $rhs_i256_val i64) (result i64)) + +;; Performs checked integer subtraction. Computes `lhs - rhs`, returning `ScError` if overflow occurred. +(func $i256_sub (param $lhs_i256_val i64) (param $rhs_i256_val i64) (result i64)) -/// Converts the value of $obj_a to an u64. Traps if the value cannot fit into u64. -func $bigint_to_u64(param $obj_a i64) (result u64) +;; Performs checked integer multiplication. Computes `lhs * rhs`, returning `ScError` if overflow occurred. +(func $i256_mul (param $lhs_i256_val i64) (param $rhs_i256_val i64) (result i64)) -/// Constructs a BigInt from an i64. -func $bigint_from_i64(param $val i64) (result i64) +;; Performs checked integer division. Computes `lhs / rhs`, returning `ScError` if `rhs == 0` or overflow occurred. +(func $i256_div (param $lhs_i256_val i64) (param $rhs_i256_val i64) (result i64)) -/// Converts the value of $obj_a to an i64. Traps if the value cannot fit into i64. -func $bigint_to_i64(param $obj_a i64) (result i64) +;; Performs checked exponentiation. Computes `lhs.exp(rhs)`, returning `ScError` if overflow occurred. +(func $i256_pow (param $lhs_i256_val i64) (param $rhs_u32_val i32) (result i64)) -/// Performs the + operation. Returns handle to the result BigInt. -func $bigint_add(param $obj_a i64) (param $obj_b i64) (result i64) +;; Performs checked shift left. Computes `lhs << rhs`, returning `ScError` if `rhs` is larger than or equal to the number of bits in `lhs`. +(func $i256_shl (param $lhs_i256_val i64) (param $rhs_u32_val i32) (result i64)) -/// Performs the - operation. Returns handle to the result BigInt. -func $bigint_sub(param $obj_a i64) (param $obj_b i64) (result i64) +;; Performs checked shift right. Computes `lhs >> rhs`, returning `ScError` if `rhs` is larger than or equal to the number of bits in `lhs`. 
+(func $i256_shr (param $lhs_i256_val i64) (param $rhs_u32_val i32) (result i64)) -/// Performs the * operation. Returns handle to the result BigInt. -func $bigint_mul(param $obj_a i64) (param $obj_b i64) (result i64) +;; Convert a `u64` to a `Timepoint` object. +(func $timepoint_obj_from_u64 (param $v u64) (result i64)) -/// Performs the / operation. Returns handle to the result BigInt. Traps if $obj_b is zero. -func $bigint_div(param $obj_a i64) (param $obj_b i64) (result i64) +;; Convert a `Timepoint` object to a `u64`. +(func $timepoint_obj_to_u64 (param $obj_timepoint_object i64) (result u64)) -/// Performs the % operation. Returns handle to the result BigInt. Traps if $obj_b is zero. -func $bigint_rem(param $obj_a i64) (param $obj_b i64) (result i64) +;; Convert a `u64` to a `Duration` object. +(func $duration_obj_from_u64 (param $v u64) (result i64)) -/// Performs the & operation. Returns handle to the result BigInt. -func $bigint_and(param $obj_a i64) (param $obj_b i64) (result i64) +;; Convert a `Duration` object to a `u64`. +(func $duration_obj_to_u64 (param $obj_duration_object i64) (result u64)) -/// Performs the | operation. Returns handle to the result BigInt. -func $bigint_or(param $obj_a i64) (param $obj_b i64) (result i64) +``` + +### "Map" host functions (mod `m`) +``` +;; Create an empty new map. +(func $map_new (result i64)) -/// Performs the ^ operation. Returns handle to the result BigInt. -func $bigint_xor(param $obj_a i64) (param $obj_b i64) (result i64) +;; Insert a key/value mapping into an existing map, and return the map object handle. If the map already has a mapping for the given key, the previous value is overwritten. +(func $map_put (param $m_map_object i64) (param $k_val i64) (param $v_val i64) (result i64)) -/// Performs the << operation. Traps if b is negative is larger than the size of u64. -func $bigint_shl(param $obj_a i64) (param $obj_b i64) (result i64) +;; Get the value for a key from a map. Traps if key is not found.
+(func $map_get (param $m_map_object i64) (param $k_val i64) (result i64)) -/// Performs the >> operation. Traps if b is negative is larger than the size of u64. -func $bigint_shr(param $obj_a i64) (param $obj_b i64) (result i64) +;; Remove a key/value mapping from a map if it exists. Traps if it doesn't. +(func $map_del (param $m_map_object i64) (param $k_val i64) (result i64)) -/// Returns an ordering between $obj_a and $obj_b: -1 (if a < b), 0 (if a == b), 1 (if a > b). -func $bigint_cmp(param $obj_a i64) (param $obj_b i64) (result i64) +;; Get the size of a map. +(func $map_len (param $m_map_object i64) (result i64)) -/// Returns true if $obj_a is equal to the additive identity. -func $bigint_is_zero(param $obj_a i64) (result i64) +;; Test for the presence of a key in a map. Returns Bool. +(func $map_has (param $m_map_object i64) (param $k_val i64) (result i64)) -/// Performs the unary - operation on $obj_a. Returns handle to the result BigInt. -func $bigint_neg(param $obj_a i64) (result i64) +;; Given a key, find the first key less than itself in the map's sorted order. If such a key does not exist, return an ScError. +(func $map_prev_key (param $m_map_object i64) (param $k_val i64) (result i64)) -/// Performs the unary ! operation on $obj_a. Returns handle to the result BigInt. -func $bigint_not(param $obj_a i64) (result i64) +;; Given a key, find the first key greater than itself in the map's sorted order. If such a key does not exist, return an ScError. +(func $map_next_key (param $m_map_object i64) (param $k_val i64) (result i64)) -/// Calculates the Greatest Common Divisor (GCD) of $obj_a and $obj_b. Returns handle to the result BigInt. -func $bigint_gcd(param $obj_a i64) (param $obj_b i64) (result i64) +;; Find the minimum key from a map. If the map is empty, return an ScError. +(func $map_min_key (param $m_map_object i64) (result i64)) -/// Calculates the Lowest Common Multiple (LCM) of $obj_a and $obj_b. Returns handle to the result BigInt.
-func $bigint_lcm(param $obj_a i64) (param $obj_b i64) (result i64) +;; Find the maximum key from a map. If the map is empty, return an ScError. +(func $map_max_key (param $m_map_object i64) (result i64)) -/// Calculates $obj_a to the power $obj_b. Returns handle to the result BigInt. Traps if b is negative or larger than the size of u64. -func $bigint_pow(param $obj_a i64) (param $obj_b i64) (result i64) +;; Return a new vector containing all the keys in a map. The new vector is ordered in the original map's key-sorted order. +(func $map_keys (param $m_map_object i64) (result i64)) -/// Calculates ($obj_a ^ $obj_e) mod $obj_m. Note that this rounds like mod_floor, not like the % operator, which makes a difference when given a negative $obj_a or $obj_m. -/// The result will be in the interval [0, $obj_m) for $obj_m > 0, or in the interval ($obj_m, 0] for $obj_m < 0. -/// Traps if the $obj_e is negative or the $obj_m is zero. -func $bigint_pow_mod(param $obj_a i64) (param $obj_e i64) (param $obj_m i64) (result i64) +;; Return a new vector containing all the values in a map. The new vector is ordered in the original map's key-sorted order. +(func $map_values (param $m_map_object i64) (result i64)) -/// Calculates the truncated principal square root of an integer $obj_a. Returns handle to the result BigInt. -/// Traps if $obj_a is negative. -func $bigint_sqrt(param $obj_a i64) (result i64) +;; Return a new map initialized from a set of input slices given by linear-memory addresses and lengths. +(func $map_new_from_linear_memory (param $keys_pos_u32_val i64) (param $vals_pos_u32_val i64) (param $len_u32_val i64) (result i64)) -/// Determines the fewest bits necessary to express the BigInt, not including the sign. -func $bigint_bits(param $obj_a i64) (result u64) +;; Copy the Val values of a map, as described by set of input keys, into an array at a given linear-memory address. 
+(func $map_unpack_to_linear_memory (param $map_map_object i64) (param $keys_pos_u32_val i64) (param $vals_pos_u32_val i64) (param $len_u32_val i64) (result i64)) ``` -### XDR Binary operations +### "Vec" host functions (mod `v`) ``` -/// Serializes an object $obj into xdr opaque binary array. Returns handle to the xdr binary array object. -func $serialize_to_binary(param $val i64) (result i64) +;; Creates a new vector with an optional capacity hint `c`. If `c` is `Void`, no hint is assumed and the new vector is empty. Otherwise, `c` is parsed as a `u32` that represents the initial capacity of the new vector. +(func $vec_new (param $c_val i64) (result i64)) -/// Given the handle to a binary object $obj_bin and the object's type code $type, Deserializes the binary into an object and returns its handle. -/// Traps if the deserialization fails for any reason. -func $deserialize_from_binary(param $obj_bin i64, param $type u64) (result i64) +;; Update the value at index `i` in the vector. Return the new vector. Trap if the index is out of bounds. +(func $vec_put (param $v_vec_object i64) (param $i_u32_val i64) (param $x_val i64) (result i64)) -/// Given a host binary object $obj_bin, copies a segment of its memory specified at $offset with length $len into the linear memory at $pos. -/// Traps if either the binary object or the linear memory doesn't have enough bytes. -func $binary_copy_to_linear_memory(param $obj_bin i64) (param $offset u64) (param $pos u64) (param $len u64) +;; Returns the element at index `i` of the vector. Traps if the index is out of bound. +(func $vec_get (param $v_vec_object i64) (param $i_u32_val i64) (result i64)) -/// Copies a segment of the linear memory specified at position $pos with length $len, into a host binary object $obj_bin at $offset. The host binary may grow in size to accommodate the new bytes. -/// Returns handle to the new binary array. -/// Traps if the linear memory doesn't have enough bytes. 
-func $binary_copy_from_linear_memory(param $obj_bin i64) (param $offset u64) (param $pos u64) (param $len u64) (result i64) +;; Delete an element in a vector at index `i`, shifting all elements after it to the left. Return the new vector. Traps if the index is out of bound. +(func $vec_del (param $v_vec_object i64) (param $i_u32_val i64) (result i64)) -/// Constructs a new binary array initialized with bytes copied from a linear memory slice specified at position $pos with length $len. -/// Returns handle to the new binary array. -func $binary_new_from_linear_memory(param $pos u64) (param $len u64) (result i64) +;; Returns length of the vector. +(func $vec_len (param $v_vec_object i64) (result i64)) -/// Construct an empty binary array, returns its handle -func $binary_new() (result i64) +;; Push a value to the front of a vector. +(func $vec_push_front (param $v_vec_object i64) (param $x_val i64) (result i64)) -/// Replaces an element at index $idx of existing binary $obj_bin with value $val. -/// Returns handle to the new binary array. -func $binary_put(param $obj_bin i64) (param $idx i64) (param $val i64) (result i64) +;; Removes the first element from the vector and returns the new vector. Traps if original vector is empty. +(func $vec_pop_front (param $v_vec_object i64) (result i64)) -/// Returns the value at position $idx from binary array $obj_bin -func $binary_get(param $obj_bin i64) (param $idx i64) (result i64) +;; Appends an element to the back of the vector. +(func $vec_push_back (param $v_vec_object i64) (param $x_val i64) (result i64)) -/// Remove a value at position $idx from binary array $obj_bin and return handle to the new binary array. -func $binary_del(param $obj_bin i64) (param $idx i64) (result i64) +;; Removes the last element from the vector and returns the new vector. Traps if original vector is empty. +(func $vec_pop_back (param $v_vec_object i64) (result i64)) -/// Return length of a binary array $obj_bin. 
-func $binary_len(param $obj_bin i64) (result i64) +;; Return the first element in the vector. Traps if the vector is empty +(func $vec_front (param $v_vec_object i64) (result i64)) -/// Push a value $val to the end of a binary array. Return handle to the new binary array. -func $binary_push(param $obj_bin i64) (param $val i64) (result i64) +;; Return the last element in the vector. Traps if the vector is empty +(func $vec_back (param $v_vec_object i64) (result i64)) -/// Remove the last value from binary array $obj_bin and return handle to the new binary array. -func $binary_pop(param $obj_bin i64) (result i64) +;; Inserts an element at index `i` within the vector, shifting all elements after it to the right. Traps if the index is out of bound +(func $vec_insert (param $v_vec_object i64) (param $i_u32_val i64) (param $x_val i64) (result i64)) -/// Returns the first element from a binary array $obj_bin -func $binary_front(param $obj_bin i64) (result i64) +;; Clone the vector `v1`, then moves all the elements of vector `v2` into it. Return the new vector. Traps if number of elements in the vector overflows a u32. +(func $vec_append (param $v1_vec_object i64) (param $v2_vec_object i64) (result i64)) -/// Returns the last element from a binary array $obj_bin -func $binary_back(param $obj_bin i64) (result i64) +;; Copy the elements from `start` index until `end` index, exclusive, in the vector and create a new vector from it. Return the new vector. Traps if the index is out of bound. +(func $vec_slice (param $v_vec_object i64) (param $start_u32_val i64) (param $end_u32_val i64) (result i64)) -/// Insert a value $val to an existing binary array $obj_bin at $idx, and shift all values after it to the right. -/// Returns handle to the new binary array. -func $binary_insert(param $obj_bin i64) (param $idx i64) (param $val i64) (result i64) +;; Get the index of the first occurrence of a given element in the vector. Returns the u32 index of the value if it's there. 
Otherwise, it returns `Void`. +(func $vec_first_index_of (param $v_vec_object i64) (param $x_val i64) (result i64)) -/// Append the binary array $obj_bin_b to the end of another binary array $obj_bin_a. -func $binary_append(param $obj_bin_a i64) (param $obj_bin_b i64) (result i64) +;; Get the index of the last occurrence of a given element in the vector. Returns the u32 index of the value if it's there. Otherwise, it returns `Void`. +(func $vec_last_index_of (param $v_vec_object i64) (param $x_val i64) (result i64)) -/// Extract a slice from a binary array $obj_bin at position $pos with length $len. -/// Returns handle to the new binary array. -func $binary_slice(param $obj_bin i64) (param $pos u64) (param $len u64) (result i64) -``` +;; Binary search a sorted vector for a given element. If it exists, the high-32 bits of the return value is 0x0001 and the low-32 bits contain the u32 index of the element. If it does not exist, the high-32 bits of the return value is 0x0000 and the low-32 bits contain the u32 index at which the element would need to be inserted into the vector to maintain sorted order. +(func $vec_binary_search (param $v_vec_object i64) (param $x_val i64) (result u64)) -### "hash" operations -``` -/// Convert a binary object $obj_bin to a hash object. -/// Returns handle to the new hash object. -func $hash_from_binary(param $obj_bin i64) (return i64) +;; Return a new vec initialized from an input slice of Vals given by a linear-memory address and length. +(func $vec_new_from_linear_memory (param $vals_pos_u32_val i64) (param $len_u32_val i64) (result i64)) -/// Convert a hash object $obj_hash to a binary object. -/// Returns handle to the new binary object. -func $hash_to_binary(param $obj_hash i64) (return i64) -``` +;; Copy the Vals of a vec into an array at a given linear-memory address. 
+(func $vec_unpack_to_linear_memory (param $vec_vec_object i64) (param $vals_pos_u32_val i64) (param $len_u32_val i64) (result i64)) -### "key" operations ``` -/// Convert a binary object $obj_bin to a public key object. -/// Returns handle to the new public key object. -func $public_key_from_binary(param $obj_bin i64) (return i64) -/// Convert a public key object $obj_pubkey to a binary object. -/// Returns handle to the new binary object. -func $public_key_to_binary(param $obj_pubkey i64) (return i64) +### "Ledger" host functions (mod `l`) ``` +;; If `f` is `Void`, then there will be no changes to flags for an existing entry, and none will be set if this is a new entry. Otherwise, `f` is parsed as a `u32`. If the value is 0, then all flags are cleared. If it's not 0, then flags will be set to the passed in value. +(func $put_contract_data (param $k_val i64) (param $v_val i64) (param $t_storage_type i64) (param $f_val i64) (result i64)) + +(func $has_contract_data (param $k_val i64) (param $t_storage_type i64) (result i64)) + +(func $get_contract_data (param $k_val i64) (param $t_storage_type i64) (result i64)) + +(func $del_contract_data (param $k_val i64) (param $t_storage_type i64) (result i64)) + +;; Creates the contract instance on behalf of `deployer`. `deployer` must authorize this call via Soroban auth framework, i.e. this calls `deployer.require_auth` with respective arguments. `wasm_hash` must be a hash of the contract code that has already been uploaded on this network. `salt` is used to create a unique contract id. Returns the address of the created contract. +(func $create_contract (param $deployer_address_object i64) (param $wasm_hash_bytes_object i64) (param $salt_bytes_object i64) (result i64)) + +;; Creates the instance of Stellar Asset contract corresponding to the provided asset. `serialized_asset` is `stellar::Asset` XDR serialized to bytes format. Returns the address of the created contract. 
+(func $create_asset_contract (param $serialized_asset_bytes_object i64) (result i64)) + +;; Uploads provided `wasm` bytecode to the network and returns its identifier (SHA-256 hash). No-op if the same Wasm object already exists. +(func $upload_wasm (param $wasm_bytes_object i64) (result i64)) + +;; Replaces the executable of the current contract with the provided Wasm code identified by a hash. Wasm entry corresponding to the hash has to already be present in the ledger. The update happens only after the current contract invocation has successfully finished, so this can be safely called in the middle of a function. +(func $update_current_contract_wasm (param $hash_bytes_object i64) (result i64)) + +;; Bumps the expiration ledger of the key specified so the entry will live for `min` ledgers from now. If the current expiration ledger is already large enough to live at least `min` more ledgers, then nothing happens. +(func $bump_contract_data (param $k_val i64) (param $t_storage_type i64) (param $min_u32_val u64) (result i64)) + +;; Bumps the expiration ledger of the current contract instance and code (if applicable), so they will live for at least `min` ledgers from the current ledger (not including it). +(func $bump_current_contract_instance_and_code (param $min_u32_val u64) (result i64)) + +;; Bumps the expiration ledger of the instance and code (if applicable) of the provided contract, so they will live for at least `min` ledgers from the current ledger (not including it). +(func $bump_contract_instance_and_code (param $contract_address_object i64) (param $min_u32_val u64) (result i64)) + +;; Get the id of a contract without creating it. `deployer` is the address of the contract deployer. `salt` is used to create a unique contract id. Returns the address of the would-be contract.
+(func $get_contract_id (param $deployer_address_object i64) (param $salt_bytes_object i64) (result i64)) + +;; Get the id of the Stellar Asset contract corresponding to the provided asset without creating the instance. `serialized_asset` is `stellar::Asset` XDR serialized to bytes format. Returns the address of the would-be asset contract. +(func $get_asset_contract_id (param $serialized_asset_bytes_object i64) (result i64)) -### Cryptographic operations ``` -/// Compute the sha256 hash of a binary array and return handle to the hash object. -func $compute_hash_sha256(param $obj_bin i64) (result i64) -/// Verify signature of content encoded in binary array $obj_bin with a public key object $obj_pk against a signature $obj_sig using ed25519. -/// Will trap if verification fails -func $verify_sig_ed25519(param $obj_bin i64) (param $obj_pk i64) (param $obj_sig i64) (result i64) +### "Call" host functions (mod `d`) ``` +;; Calls a function in another contract with arguments contained in vector `args`. If the call is successful, returns the result of the called function. Traps otherwise. +(func $call (param $contract_address_object i64) (param $func_symbol i64) (param $args_vec_object i64) (result i64)) + +;; Calls a function in another contract with arguments contained in vector `args`, returning either the result of the called function or an ScError if the called function failed. +(func $try_call (param $contract_address_object i64) (param $func_symbol i64) (param $args_vec_object i64) (result i64)) -### Context host functions ``` -/// Get the contractID of the contract which invoked the -/// running contract. Traps if the running contract was not -/// invoked by a contract. -func $get_invoking_contract() -> (result i64) - -/// Records a contract event. $topics is expected to be a SVec -/// and $data is expected to be an ScVal. -/// On success, returns an SCStatus::Ok. 
-func $contract_event(param $topics i64)(param $data i64) -> (result i64) + +### "Buf" host functions (mod `b`) ``` +;; Serializes an (SC)Val into XDR opaque `Bytes` object. +(func $serialize_to_bytes (param $v_val i64) (result i64)) + +;; Deserialize a `Bytes` object to get back the (SC)Val. +(func $deserialize_from_bytes (param $b_bytes_object i64) (result i64)) + +;; Copies a slice of bytes from a `Bytes` object specified at offset `b_pos` with length `len` into the linear memory at position `lm_pos`. Traps if either the `Bytes` object or the linear memory doesn't have enough bytes. +(func $bytes_copy_to_linear_memory (param $b_bytes_object i64) (param $b_pos_u32_val i32) (param $lm_pos_u32_val i32) (param $len_u32_val i32) (result i64)) + +;; Copies a segment of the linear memory specified at position `lm_pos` with length `len`, into a `Bytes` object at offset `b_pos`. The `Bytes` object may grow in size to accommodate the new bytes. Traps if the linear memory doesn't have enough bytes. +(func $bytes_copy_from_linear_memory (param $b_bytes_object i64) (param $b_pos_u32_val i32) (param $lm_pos_u32_val i32) (param $len_u32_val i32) (result i64)) + +;; Constructs a new `Bytes` object initialized with bytes copied from a linear memory slice specified at position `lm_pos` with length `len`. +(func $bytes_new_from_linear_memory (param $lm_pos_u32_val i32) (param $len_u32_val i32) (result i64)) + +;; Create an empty new `Bytes` object. +(func $bytes_new (result i64)) + +;; Update the value at index `i` in the `Bytes` object. Return the new `Bytes`. Trap if the index is out of bounds. +(func $bytes_put (param $b_bytes_object i64) (param $i_u32_val i32) (param $u_u32_val i32) (result i64)) + +;; Returns the element at index `i` of the `Bytes` object. Traps if the index is out of bound. +(func $bytes_get (param $b_bytes_object i64) (param $i_u32_val i32) (result i32)) + +;; Delete an element in a `Bytes` object at index `i`, shifting all elements after it to the left. 
Return the new `Bytes`. Traps if the index is out of bound. +(func $bytes_del (param $b_bytes_object i64) (param $i_u32_val i32) (result i64)) + +;; Returns length of the `Bytes` object. +(func $bytes_len (param $b_bytes_object i64) (result i32)) + +;; Appends an element to the back of the `Bytes` object. +(func $bytes_push (param $b_bytes_object i64) (param $u_u32_val i32) (result i64)) + +;; Removes the last element from the `Bytes` object and returns the new `Bytes`. Traps if original `Bytes` is empty. +(func $bytes_pop (param $b_bytes_object i64) (result i64)) + +;; Return the first element in the `Bytes` object. Traps if the `Bytes` is empty +(func $bytes_front (param $b_bytes_object i64) (result i32)) + +;; Return the last element in the `Bytes` object. Traps if the `Bytes` is empty +(func $bytes_back (param $b_bytes_object i64) (result i32)) + +;; Inserts an element at index `i` within the `Bytes` object, shifting all elements after it to the right. Traps if the index is out of bound +(func $bytes_insert (param $b_bytes_object i64) (param $i_u32_val i32) (param $u_u32_val i32) (result i64)) + +;; Clone the `Bytes` object `b1`, then moves all the elements of `Bytes` object `b2` into it. Return the new `Bytes`. Traps if its length overflows a u32. +(func $bytes_append (param $b1_bytes_object i64) (param $b2_bytes_object i64) (result i64)) + +;; Copies the elements from `start` index until `end` index, exclusive, in the `Bytes` object and creates a new `Bytes` from it. Returns the new `Bytes`. Traps if the index is out of bound. +(func $bytes_slice (param $b_bytes_object i64) (param $start_u32_val i32) (param $end_u32_val i32) (result i64)) + +;; Copies a slice of bytes from a `String` object specified at offset `s_pos` with length `len` into the linear memory at position `lm_pos`. Traps if either the `String` object or the linear memory doesn't have enough bytes. 
+(func $string_copy_to_linear_memory (param $s_string_object i64) (param $s_pos_u32_val i64) (param $lm_pos_u32_val i64) (param $len_u32_val i64) (result i64)) + +;; Copies a slice of bytes from a `Symbol` object specified at offset `s_pos` with length `len` into the linear memory at position `lm_pos`. Traps if either the `Symbol` object or the linear memory doesn't have enough bytes. +(func $symbol_copy_to_linear_memory (param $s_symbol_object i64) (param $s_pos_u32_val i64) (param $lm_pos_u32_val i64) (param $len_u32_val i64) (result i64)) + +;; Constructs a new `String` object initialized with bytes copied from a linear memory slice specified at position `lm_pos` with length `len`. +(func $string_new_from_linear_memory (param $lm_pos_u32_val i64) (param $len_u32_val i64) (result i64)) -### Account host functions +;; Constructs a new `Symbol` object initialized with bytes copied from a linear memory slice specified at position `lm_pos` with length `len`. +(func $symbol_new_from_linear_memory (param $lm_pos_u32_val i64) (param $len_u32_val i64) (result i64)) + +;; Returns length of the `String` object. +(func $string_len (param $s_string_object i64) (result i64)) + +;; Returns length of the `Symbol` object. +(func $symbol_len (param $s_symbol_object i64) (result i64)) + +;; Return the index of a Symbol in an array of linear-memory byte-slices, or trap if not found. +(func $symbol_index_in_linear_memory (param $sym_symbol i64) (param $slices_pos_u32_val i64) (param $len_u32_val i64) (result i64)) ``` -/// Get the low threshold for the account with ID `a` (`a` is -/// `AccountId`). Traps if no such account exists. -func $account_get_low_threshold(param $a i64) (result i64) - -/// Get the medium threshold for the account with ID `a` (`a` is -/// `AccountId`). Traps if no such account exists. -func $account_get_medium_threshold(param $a i64) (result i64) - -/// Get the high threshold for the account with ID `a` (`a` is -/// `AccountId`).
Traps if no such account exists. -func $account_get_high_threshold(param $a i64) (result i64) - -/// Get the signer weight for the signer with ed25519 public key -/// `s` (`s` is `Bytes`) on the account with ID `a` (`a` -/// is `AccountId`). Returns the master weight if the signer is the -/// master, and returns 0 if no such signer exists. Traps if no -/// such account exists. -func $account_get_signer_weight(param $a i64, param $s i64) (result i64) + +### "Crypto" host functions (mod `c`) ``` +;; +(func $compute_hash_sha256 (param $x_bytes_object i64) (result i64)) -### XDR changes -This CAP builds on top of CAP-0046 by expanding the repertoire of host object types. Thus the diff is made with CAP-0046 assuming it will be merged into stellar-core before this CAP finalizes. This section will be updated to observe any changes made to the CAP-0046 XDR set. +;; +(func $verify_sig_ed25519 (param $k_bytes_object i64) (param $x_bytes_object i64) (param $s_bytes_object i64) (result i64)) + +;; Returns the keccak256 hash of given input bytes. +(func $compute_hash_keccak256 (param $x_bytes_object i64) (result i64)) +;; Recovers the SEC-1-encoded ECDSA secp256k1 public key that produced a given 64-byte signature over a given 32-byte message digest, for a given recovery_id byte. 
+(func $recover_key_ecdsa_secp256k1 (param $msg_digest_bytes_object i64) (param $signature_bytes_object i64) (param $recovery_id_u32_val i64) (result i64)) ``` -diff --git a/src/protocol-next/xdr/Stellar-contract.x b/src/protocol-next/xdr/Stellar-contract.x -index d299068e..c8d50593 100644 ---- a/src/protocol-next/xdr/Stellar-contract.x -+++ b/src/protocol-next/xdr/Stellar-contract.x -@@ -112,7 +112,10 @@ enum SCObjectType - SCO_MAP = 1, - SCO_U64 = 2, - SCO_I64 = 3, -- SCO_BINARY = 4 -+ SCO_BINARY = 4, -+ SCO_BIGINT = 5, -+ SCO_HASH = 6, -+ SCO_PUBLIC_KEY = 7, - - // TODO: add more - }; -@@ -126,6 +129,33 @@ struct SCMapEntry - typedef SCVal SCVec<256000>; - typedef SCMapEntry SCMap<256000>; - -+enum SCNumSign -+{ -+ NEGATIVE = -1, -+ ZERO = 0, -+ POSITIVE = 1, -+}; -+ -+union SCBigInt switch (SCNumSign sign) -+{ -+case ZERO: -+ void; -+case POSITIVE: -+case NEGATIVE: -+ opaque magnitude<256000>; -+}; -+ -+enum SCHashType -+{ -+ SCHASH_SHA256 = 0, -+}; -+ -+union SCHash switch (SCHashType type) -+{ -+case SCHASH_SHA256: -+ Hash sha256; -+}; -+ - union SCObject switch (SCObjectType type) - { - case SCO_VEC: -@@ -138,5 +168,11 @@ case SCO_I64: - int64 i64; - case SCO_BINARY: - opaque bin<256000>; -+case SCO_BIGINT: -+ SCBigInt bi; -+case SCO_HASH: -+ SCHash hash; -+case SCO_PUBLIC_KEY: -+ PublicKey publicKey; - }; - } + +### "Address" host functions (mod `a`) ``` +;; Checks if the address has authorized the invocation of the current contract function with the provided arguments. Traps if the invocation hasn't been authorized. +(func $require_auth_for_args (param $address_address_object i64) (param $args_vec_object i64) (result i64)) -### New host object types -The following new host object types are introduced on top of [CAP-0046](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046.md#host-object-types): -- Object type 6: an arbitrary precision big integer number (BigInt). 
Although there is no explicit numerical limit the BigInt, its size is bounded by the XDR binary length limit of 256000, which equates to numerical range from -256^256000 + 1 to 256^256000 - 1. Several operations on the BigInt also imposes numerical limits on the operand due to possible memory overflow: -1. The operand of the left and right shift operations (`bigint_shl` and `bigint_shr` cannot be negative and cannot exceed size of `u64`. -2. The exponent in `bigint_pow` cannot be negative and cannot exceed size of `u64`. +;; Checks if the address has authorized the invocation of the current contract function with all the arguments of the invocation. Traps if the invocation hasn't been authorized. +(func $require_auth (param $address_address_object i64) (result i64)) -- Object type 7: an XDR hash. -- Object type 8: an XDR public key. +;; Converts a provided 32-byte Stellar account public key to the corresponding address. This is only useful in the context of cross-chain interoperability. Prefer directly using the Address objects whenever possible. +(func $account_public_key_to_address (param $pk_bytes_bytes_object i64) (result i64)) -Of which, type 7 and 8 are XDR types, which follow the same standard semantics of XDR objects described in CAP-0046 including comparison, validity and conversion. Additional semantics are listed below. +;; Converts a provided 32-byte contract identifier to a corresponding Address object. +(func $contract_id_to_address (param $contract_id_bytes_bytes_object i64) (result i64)) -#### Comparison -- Public key / hash: if A and B are both public keys / hashes with different types, they are ordered by the type value. If A and B are of the same public key / hash types, they are ordered by the key / hash values. -- BigInt: if A and B are both BigInts with different signs, they are ordered by the sign values. If A and B are of the same signs, they are ordered by their values -- (`sign`) `magnitude`. 
+;; Returns the 32-byte public key of the Stellar account corresponding to the provided Address object. If the Address doesn't belong to an account, returns Val corresponding to the unit type (`()`). +(func $address_to_account_public_key (param $address_address_object i64) (result i64)) -#### Conversion for BigInt -The host object uses implementation-specific data structure to store the sign and magnitude of the BigInt. We intend to use leverage Rust's [num-bigint](https://docs.rs/num-bigint/latest/num_bigint/) crate, which internally stores the magnitude as a `vec`. However, the conversion rule described here should be easily generalizable to other implementations. +;; Returns the 32-byte contract identifier corresponding to the provided Address object. If the Address doesn't belong to a contract, returns Val corresponding to the unit type (`()`). +(func $address_to_contract_id (param $address_address_object i64) (result i64)) -From XDR (`ScObject` with `type` case `SCO_BIGINT`) to a host object: +;; Authorizes sub-contract calls for the next contract call on behalf of the current contract. Every entry in the argument vector corresponds to `InvokerContractAuthEntry` contract type that authorizes a tree of `require_auth` calls on behalf of the current contract. The entries must not contain any authorizations for the direct contract call, i.e. if current contract needs to call contract function F1 that calls function F2 both of which require auth, only F2 should be present in `auth_entries`. +(func $authorize_as_curr_contract (param $auth_entires_vec_object i64) (result i64)) +``` -Construct the host object BigInt with sign consistent with the `sign` case of the `SCBigInt`. If `sign` is `SCNumSign::ZERO`, then BigInt's magnitude is left empty. Otherwise, decode the bytes stored in the `magnitude` field in big-endian (BE) into the BigInt's magnitude.
For example with [num-bigint](https://docs.rs/num-bigint/latest/num_bigint/)'s implementation, the first `u64` element in the `vec` will be constructed by taking the last 8 bytes in the opaque array, reverse the order, shift each byte by `8 * index`, then compute the sum. Refer to [from_bytes_be](https://docs.rs/num-bigint/0.4.3/num_bigint/struct.BigInt.html#method.from_bytes_be) for implementation detail. +### "Test" host functions (mod `t`) +``` +;; A dummy function taking 0 arguments and performs no-op. This function is for test purpose only, for measuring the roundtrip cost of invoking a host function, i.e. host->Vm->host. +(func $dummy0 (result i64)) +``` -From a host object to XDR: +### "prng" host functions (mod `p`) +``` +;; Reseed the frame-local PRNG with a given BytesObject, which should be 32 bytes long. +(func $prng_reseed (param $seed_bytes_object i64) (result i64)) -Construct the `SCBigInt` with the correct `sign` case, consistent with the BigInt's sign. If the BigInt is zero, then it's done. Otherwise, encode the BigInt's magnitude in BE into bytes stored in the `magnitude` field. Refer to [to_bytes_be](https://docs.rs/num-bigint/latest/num_bigint/struct.BigInt.html#method.to_bytes_be) for implementation detail. +;; Construct a new BytesObject of the given length filled with bytes drawn from the frame-local PRNG. +(func $prng_bytes_new (param $length_u32_val i64) (result i64)) -#### Conversion for other types -For other types, i.e. `SCO_HASH`, `SCO_PUBLIC_KEY`, conversion between XDR and host object is to simply move around the contained value unaltered. +;; Return a u64 uniformly sampled from the inclusive range [lo,hi] by the frame-local PRNG. +(func $prng_u64_in_inclusive_range (param $lo_u64 u64) (param $hi_u64 u64) (result u64)) +;; Return a (Fisher-Yates) shuffled clone of a given vector, using the frame-local PRNG. 
+(func $prng_vec_shuffle (param $vec_vec_object i64) (result i64)) +``` + +### XDR changes +See [CAP-0046-01](./cap-0046-01.md#xdr-changes) for the detailed definitions of all the host object types and the semantics of their operations. ## Design Rationale -The WASM smart-contract system for the Stellar network is divided into the host context and the guest context, and the host functions define the interface between the host environment (running the host context) and the VM (running the guest code) via which guest code can interact with the compute resources and host objects. For the full definitions of the host and guest context, host environment, virtual machine, please refer to the “Components” section in [CAP-0046](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046.md#components). +The WASM smart-contract system for the Stellar network is divided into the host context and the guest context, and the host functions define the interface between the host environment (running the host context) and the VM (running the guest code) via which guest code can interact with the compute resources and host objects. For the full definitions of the host and guest context, host environment, virtual machine, please refer to the “Components” section in [CAP-0046-01](https://github.com/stellar/stellar-protocol/blob/master/core/cap-0046-01.md#components). The guest-host split allows common smart contract operations and computational heavy-lifting to be off-loaded to the host side. This reduces guest code size and results in a variety of benefits outlined in CAP-0046. However on the flip side, this potentially increases the attack surface and maintenance burden for the host function developers. Therefore it is vital to be judicious on host function selection, and it needs to be based on a clear set of criteria.
The criteria we choose for host functions in this phase of the project are: - **Relevance**: the functions need to be relevant to a wide spectrum of smart contract applications. In particular, we would like to prioritize the expensive primitives that are common among smart contract operations. -- **Determinism**: produces designed outcome deterministically, both in terms of function output and gas cost. They also must perform deterministically across future version upgrades. +- **Determinism**: must produce the designed outcome deterministically across all relevant platforms. - **Efficiency**: must run within a reasonably limited resource restriction. Any smart contracts that run out of its resource limits will result in a trap. -- **Easy to implement**: must be reasonably straightforward to implement. This is especially relevant for the first-version of the prototype. +- **Maintainability**: must be reasonably straightforward to implement and easy to maintain. The maintainability requirement also extends to any third-party library chosen to implement a particular host function. ### Additional functions/host objects may be included -The list of host functions presented above is by no means an exhaustive list of host functions that are officially determined. Rather It is to be interpreted as an initial set of recommendations based on the four criteria listed above, to serve as a starting point to be iterated on. Ultimately, the official list will be determined by the requirements and needs of the stellar ecosystem. For example, the cryptographic section currently only contains two functions which are the primitives (SHA256, ED25519) usd by the stellar core today. Ultimately it will be driven by the smart contract applications and the cryptographic primitives that are most vital to facilitating their development.
- -Besides cryptographic operations, big rational numbers and string operations are two additional sections we may consider to include depending on feedback from the developer community. Additional host objects may be added accordingly as a consequence. - -### Resource accounting / Gas cost -This CAP cannot be complete without well-defined resource accounting metrics for each host function. Resource consumption needs to be accounted for on both guest and host sides. On the guest side, resource is measured by the instruction count on the WASM virtual machine as well as the size of the machine. On the host side, the resource count depends on a few main factors: -1. Computation cost is measured by the size of the input and the asymptotic complexity of the algorithm. E.g. insertion of a single element into an ordered map have complexity of `alpha x log(n)`, with `n` being the total size of the map. -2. `alpha` is the empirical cost parameter that is determined by measuring and calibrating against various host object operations (e.g. multiplying BigNums) as well as existing ledger transactions (e.g. making a payment). They thus can change over time based on the ledger state and the global fee schedule. The exact schedule of these cost parameters is beyond the scope of this CAP. -3. The memory cost is measured by the size of the input objects, as well as intermediate and final objects created during the function call (since all operations are immutable). -4. The resource accounting must sufficiently account for data structure overhead. This is especially relevant if we have a large amount of small host objects (e.g. a vector of size 1), or deeply nested host objects (vector of vectors of maps…). +The list of host functions proposed is an initial set based on the criteria above. It is not meant to be an exhaustive list. The list of host functions will be an evolving set determined based on the requirement and needs of the stellar ecosystem. 
-### Error handling — favor traps over errors -There are three main reasons for biasing error handling to generate traps in the host functions rather than returning errors to the guest code. -1. Minimizes the amount of redundant error-handling guest code even on the non-error path, thus reducing cost. -2. Trapping on the host in most error cases ensures errors are handled (by escalation to a transaction abort), whereas many functions that have "error code returns" can have those errors ignored, which makes contracts more likely to be buggy. -3. There is no easy way to communicate a structured error value to the user (such as a `result` or `option`) type. We would wind up either allocating an object to wrap every result from every function or commit one bit in the host value to denote a `None` value (analogous to Rust’s `std::option`). Both of these approaches introduce more complexity and are error prone. +### Resource metering +All the host functions are subject to resource metering specified in [cap-0046-01](./cap-0046-01.md). CPU and memory consumption is tracked by metering during the host function execution, and exceeding the resource limit will result in an `SCEC_EXCEEDED_LIMIT` error. -There is also a flexibility cost to traps. Contract developers have no opportunity to write code that is allowed to fail. For this reason, it is a goal that host functions that trap should provide a way to preemptively determine if a trap would occur if called. For example, the `vec_get` function will trap if the index argument is greater than the length of the vector, but a contract developer can use the `vec_len` function to check if this would occur before calling `vec_get`. +### Error handling +All host functions (with the exception of `try_call`) are infallible. An error generated during the host function execution will immediately result in a trap to the guest VM. The alternative approach would be to make all host functions fallible, i.e.
to include a success-or-failure signal in the returned 64-bit value. There are a few reasons favoring the infallible interface: +1. Minimizes the amount of redundant error-handling guest code even on the non-error path (which is most of the time), thus reducing code size and resource cost. +2. Trapping by default ensures errors are not hidden or forgotten, and therefore makes for a safer design. +3. Including the success-or-failure signal in the return value requires additional implementation complexity on the host, which is paid for by every contract on every host function call. +4. There is no easy way to disambiguate "fail with status" vs "ok with status". See the explanation [below](#try_call). -### `SCStatus`-Returning Host Functions -`SCStatus` is an `SCVal` case designed for conveying function-calling status (such as error code) between the host and the guest. The `SCStatus` cases will be expanded to include additional types as well as concrete host function error codes in future iterations of this CAP. +The host function repertoire should be clear on the failure conditions, and should contain enough +building blocks to help the guest preemptively decide if a failure condition will be triggered before making the call. For example, the `vec_get` function will trap if the index argument is greater than or equal to the length of the vector, but a contract developer can use the `vec_len` function to check if this would occur before calling `vec_get`. -The only host functions that returns the `SCStatus` currently are `map_prev_key`, `map_next_key`, `map_first_key` and `map_last_key`, which return an `SCStatus` containing the error code corresponding to an "element not exist" error (the exact error code and type are TBD). +#### `try_call` +The only fallible host function is `try_call`, which will return the error code as the result (instead of trapping) on failure. -But what if a host function's return value **is** a `SCStatus`? E.g. calling `vec_get` on a vector object `vec`.
- -In order to disambiguate the two cases (host function returning an `SCStatus` to communicate a call status versus a host function returning a stored `SCStatus` value), we will disallow any storage of `SCStatus` in the host context. Thus a returned `SCStatus` can only be interpreted as a function call status. - -## Protocol Upgrade Transition -This CAP does not introduce any protocol changes. - -### Backwards Incompatibilities -This CAP does not introduce any backwards incompatibility. - -### Resource Utilization -By allowing smart contracts to invoke host functions operating on host objects, this CAP introduces significant change in resource utilization patterns that are discussed in Design Rationale, and will be further expanded before finalization. +One downside of allowing error code as return value is the ambiguity of "fail with status" and "ok with status". If a contract function returns an `ScError` as its ok return value, there is no other mechanism deciding if the error is the ok value or the failure status. ## Security Concerns The security concerns are the same as what have been outlined in CAP-0046. By expanding the host object repertoire and introducing host functions that create and operate on host objects on-the-fly during runtime, we’ve expanded the surface where those concerns manifest. In particular, this CAP aims to address “the risk of mis-metering of guest-controlled resources and denial of service”, by detailing the exact metrics of resource accounting on both the guest and the host side. -## Test Cases -TBD - ## Implementation -TBD. See [rs-stellar-contract-env](https://github.com/stellar/rs-stellar-contract-env) and stellar-core’s repo (branch to be added) for the prototype implementation. +Host functions have been implemented in [rs-soroban-env](https://github.com/stellar/rs-soroban-env). 
\ No newline at end of file diff --git a/core/cap-0046-10.md b/core/cap-0046-10.md index 24e772d9b..2a47fada0 100644 --- a/core/cap-0046-10.md +++ b/core/cap-0046-10.md @@ -4,8 +4,8 @@ CAP: 0046-10 Title: Smart Contract Budget Metering Working Group: - Owner: Jay Geng <@jayz22> - Authors: Jay Geng <@jayz22>, Graydon Hoare <@graydon> + Owner: Jay Geng <@jayz22>, Graydon Hoare <@graydon> + Authors: Jay Geng <@jayz22> Consulted: Nicolas Barry <@MonsieurNicolas>, Dmytro Kozhevin <@dmkozh> Status: Draft Created: 2022-12-20 @@ -96,7 +96,7 @@ The result of calibration for per resource type is a set of cost parameters of s ### The budget The budget for each resource type is a `ConfigSettingEntry` that is determined in consensus by the validators. The budget reflects the ledger processing capacity in accordance to the requirements in the "Requirements" section. We can start with an initial `cpu_insns` budget of 4'000'000 and `mem_bytes` of 10MB. These numbers may change before this CAP finalizes. -At every metering charging, the total charges will be compared with the budget, and if exceeds, will result in a "resource budget exceeded" host error. +At every metering charging, the total charges will be compared with the budget, and if exceeds, will result in a `SCEC_EXCEEDED_LIMIT` host error. ### XDR changes See [cap-0046 Overview](./cap-0046-01.md), specifically the `ConfigSettingEntry` which has four new additions corresponding to budget and metering for each resource type, as well as the new file [Stellar-contract-cost-type.x](../contents/cap-0046/Stellar-contract-cost-type.x) that defines the cost types `ContractCostType` and cost parameters entry `ContractCostParamEntry`. From 0b361cce1ea3f7da006893b4d4f8b179cd83ee96 Mon Sep 17 00:00:00 2001 From: Jay Geng Date: Fri, 25 Aug 2023 14:38:30 -0400 Subject: [PATCH 6/7] Update the soroban metering cap (#1376) * Update the soroban metering cap * fixup! 
Update the soroban metering cap * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh * Update core/cap-0046-10.md Co-authored-by: Siddharth Suresh --------- Co-authored-by: Graydon Hoare Co-authored-by: Siddharth Suresh --- .../Stellar-contract-config-setting.x | 247 ++++++++++++++++++ core/cap-0046-10.md | 141 ++++------ 2 files changed, 295 insertions(+), 93 deletions(-) create mode 100644 contents/cap-0046/Stellar-contract-config-setting.x diff --git a/contents/cap-0046/Stellar-contract-config-setting.x b/contents/cap-0046/Stellar-contract-config-setting.x new file mode 100644 index 000000000..bb76b3c63 --- /dev/null +++ b/contents/cap-0046/Stellar-contract-config-setting.x @@ -0,0 +1,247 @@ +%#include "xdr/Stellar-types.h" + +namespace stellar { +// General “Soroban execution lane” settings +struct ConfigSettingContractExecutionLanesV0 +{ + // maximum number of Soroban transactions per ledger + uint32 ledgerMaxTxCount; +}; + +// "Compute" settings for contracts (instructions and memory). +struct ConfigSettingContractComputeV0 +{ + // Maximum instructions per ledger + int64 ledgerMaxInstructions; + // Maximum instructions per transaction + int64 txMaxInstructions; + // Cost of 10000 instructions + int64 feeRatePerInstructionsIncrement; + + // Memory limit per transaction. Unlike instructions, there is no fee + // for memory, just the limit. + uint32 txMemoryLimit; +}; + +// Ledger access settings for contracts. 
+struct ConfigSettingContractLedgerCostV0 +{ + // Maximum number of ledger entry read operations per ledger + uint32 ledgerMaxReadLedgerEntries; + // Maximum number of bytes that can be read per ledger + uint32 ledgerMaxReadBytes; + // Maximum number of ledger entry write operations per ledger + uint32 ledgerMaxWriteLedgerEntries; + // Maximum number of bytes that can be written per ledger + uint32 ledgerMaxWriteBytes; + + // Maximum number of ledger entry read operations per transaction + uint32 txMaxReadLedgerEntries; + // Maximum number of bytes that can be read per transaction + uint32 txMaxReadBytes; + // Maximum number of ledger entry write operations per transaction + uint32 txMaxWriteLedgerEntries; + // Maximum number of bytes that can be written per transaction + uint32 txMaxWriteBytes; + + int64 feeReadLedgerEntry; // Fee per ledger entry read + int64 feeWriteLedgerEntry; // Fee per ledger entry write + + int64 feeRead1KB; // Fee for reading 1KB + + // The following parameters determine the write fee per 1KB. + // Write fee grows linearly until bucket list reaches this size + int64 bucketListTargetSizeBytes; + // Fee per 1KB write when the bucket list is empty + int64 writeFee1KBBucketListLow; + // Fee per 1KB write when the bucket list has reached `bucketListTargetSizeBytes` + int64 writeFee1KBBucketListHigh; + // Write fee multiplier for any additional data past the first `bucketListTargetSizeBytes` + uint32 bucketListWriteFeeGrowthFactor; +}; + +// Historical data (pushed to core archives) settings for contracts. +struct ConfigSettingContractHistoricalDataV0 +{ + int64 feeHistorical1KB; // Fee for storing 1KB in archives +}; + +// Contract event-related settings. +struct ConfigSettingContractEventsV0 +{ + // Maximum size of events that a contract call can emit. + uint32 txMaxContractEventsSizeBytes; + // Fee for generating 1KB of contract events. + int64 feeContractEvents1KB; +}; + +// Bandwidth related data settings for contracts. 
+// We consider bandwidth to only be consumed by the transaction envelopes, hence +// this concerns only transaction sizes. +struct ConfigSettingContractBandwidthV0 +{ + // Maximum sum of all transaction sizes in the ledger in bytes + uint32 ledgerMaxTxsSizeBytes; + // Maximum size in bytes for a transaction + uint32 txMaxSizeBytes; + + // Fee for 1 KB of transaction size + int64 feeTxSize1KB; +}; + +enum ContractCostType { + // Cost of running 1 wasm instruction + WasmInsnExec = 0, + // Cost of growing wasm linear memory by 1 page + WasmMemAlloc = 1, + // Cost of allocating a chunk of host memory (in bytes) + HostMemAlloc = 2, + // Cost of copying a chunk of bytes into a pre-allocated host memory + HostMemCpy = 3, + // Cost of comparing two slices of host memory + HostMemCmp = 4, + // Cost of a host function dispatch, not including the actual work done by + // the function nor the cost of VM invocation machinery + DispatchHostFunction = 5, + // Cost of visiting a host object from the host object storage. Exists to + // ensure some baseline cost coverage, i.e. repeatedly visiting objects + // by the guest will always incur some charges. + VisitObject = 6, + // Cost of serializing an xdr object to bytes + ValSer = 7, + // Cost of deserializing an xdr object from bytes + ValDeser = 8, + // Cost of computing the sha256 hash from bytes + ComputeSha256Hash = 9, + // Cost of computing the ed25519 pubkey from bytes + ComputeEd25519PubKey = 10, + // Cost of accessing an entry in a Map. + MapEntry = 11, + // Cost of accessing an entry in a Vec + VecEntry = 12, + // Cost of verifying ed25519 signature of a payload. + VerifyEd25519Sig = 13, + // Cost of reading a slice of vm linear memory + VmMemRead = 14, + // Cost of writing to a slice of vm linear memory + VmMemWrite = 15, + // Cost of instantiating a VM from wasm bytecode. + VmInstantiation = 16, + // Cost of instantiating a VM from a cached state.
+ VmCachedInstantiation = 17, + // Cost of invoking a function on the VM. If the function is a host function, + // additional cost will be covered by `DispatchHostFunction`. + InvokeVmFunction = 18, + // Cost of computing a keccak256 hash from bytes. + ComputeKeccak256Hash = 19, + // Cost of computing an ECDSA secp256k1 pubkey from bytes. + ComputeEcdsaSecp256k1Key = 20, + // Cost of computing an ECDSA secp256k1 signature from bytes. + ComputeEcdsaSecp256k1Sig = 21, + // Cost of recovering an ECDSA secp256k1 key from a signature. + RecoverEcdsaSecp256k1Key = 22, + // Cost of int256 addition (`+`) and subtraction (`-`) operations + Int256AddSub = 23, + // Cost of int256 multiplication (`*`) operation + Int256Mul = 24, + // Cost of int256 division (`/`) operation + Int256Div = 25, + // Cost of int256 power (`exp`) operation + Int256Pow = 26, + // Cost of int256 shift (`shl`, `shr`) operation + Int256Shift = 27 +}; + +struct ContractCostParamEntry { + // use `ext` to add more terms (e.g. higher order polynomials) in the future + ExtensionPoint ext; + + int64 constTerm; + int64 linearTerm; +}; + +struct StateExpirationSettings { + uint32 maxEntryExpiration; + uint32 minTempEntryExpiration; + uint32 minPersistentEntryExpiration; + uint32 autoBumpLedgers; + + // rent_fee = wfee_rate_average / rent_rate_denominator_for_type + int64 persistentRentRateDenominator; + int64 tempRentRateDenominator; + + // max number of entries that emit expiration meta in a single ledger + uint32 maxEntriesToExpire; + + // Number of snapshots to use when calculating average BucketList size + uint32 bucketListSizeWindowSampleSize; + + // Maximum number of bytes that we scan for eviction per ledger + uint64 evictionScanSize; + + // Lowest BucketList level to be scanned to evict entries + uint32 startingEvictionScanLevel; +}; + +struct EvictionIterator { + uint32 bucketListLevel; + bool isCurrBucket; + uint64 bucketFileOffset; +}; + +// limits the ContractCostParams size to 20kB +const 
CONTRACT_COST_COUNT_LIMIT = 1024; + +typedef ContractCostParamEntry ContractCostParams<CONTRACT_COST_COUNT_LIMIT>; + +// Identifiers of all the network settings. +enum ConfigSettingID +{ + CONFIG_SETTING_CONTRACT_MAX_SIZE_BYTES = 0, + CONFIG_SETTING_CONTRACT_COMPUTE_V0 = 1, + CONFIG_SETTING_CONTRACT_LEDGER_COST_V0 = 2, + CONFIG_SETTING_CONTRACT_HISTORICAL_DATA_V0 = 3, + CONFIG_SETTING_CONTRACT_EVENTS_V0 = 4, + CONFIG_SETTING_CONTRACT_BANDWIDTH_V0 = 5, + CONFIG_SETTING_CONTRACT_COST_PARAMS_CPU_INSTRUCTIONS = 6, + CONFIG_SETTING_CONTRACT_COST_PARAMS_MEMORY_BYTES = 7, + CONFIG_SETTING_CONTRACT_DATA_KEY_SIZE_BYTES = 8, + CONFIG_SETTING_CONTRACT_DATA_ENTRY_SIZE_BYTES = 9, + CONFIG_SETTING_STATE_EXPIRATION = 10, + CONFIG_SETTING_CONTRACT_EXECUTION_LANES = 11, + CONFIG_SETTING_BUCKETLIST_SIZE_WINDOW = 12, + CONFIG_SETTING_EVICTION_ITERATOR = 13 +}; + +union ConfigSettingEntry switch (ConfigSettingID configSettingID) +{ +case CONFIG_SETTING_CONTRACT_MAX_SIZE_BYTES: + uint32 contractMaxSizeBytes; +case CONFIG_SETTING_CONTRACT_COMPUTE_V0: + ConfigSettingContractComputeV0 contractCompute; +case CONFIG_SETTING_CONTRACT_LEDGER_COST_V0: + ConfigSettingContractLedgerCostV0 contractLedgerCost; +case CONFIG_SETTING_CONTRACT_HISTORICAL_DATA_V0: + ConfigSettingContractHistoricalDataV0 contractHistoricalData; +case CONFIG_SETTING_CONTRACT_EVENTS_V0: + ConfigSettingContractEventsV0 contractEvents; +case CONFIG_SETTING_CONTRACT_BANDWIDTH_V0: + ConfigSettingContractBandwidthV0 contractBandwidth; +case CONFIG_SETTING_CONTRACT_COST_PARAMS_CPU_INSTRUCTIONS: + ContractCostParams contractCostParamsCpuInsns; +case CONFIG_SETTING_CONTRACT_COST_PARAMS_MEMORY_BYTES: + ContractCostParams contractCostParamsMemBytes; +case CONFIG_SETTING_CONTRACT_DATA_KEY_SIZE_BYTES: + uint32 contractDataKeySizeBytes; +case CONFIG_SETTING_CONTRACT_DATA_ENTRY_SIZE_BYTES: + uint32 contractDataEntrySizeBytes; +case CONFIG_SETTING_STATE_EXPIRATION: + StateExpirationSettings stateExpirationSettings; +case CONFIG_SETTING_CONTRACT_EXECUTION_LANES: +
ConfigSettingContractExecutionLanesV0 contractExecutionLanes; +case CONFIG_SETTING_BUCKETLIST_SIZE_WINDOW: + uint64 bucketListSizeWindow<>; +case CONFIG_SETTING_EVICTION_ITERATOR: + EvictionIterator evictionIterator; +}; +} diff --git a/core/cap-0046-10.md b/core/cap-0046-10.md index 2a47fada0..cd85a1c23 100644 --- a/core/cap-0046-10.md +++ b/core/cap-0046-10.md @@ -35,14 +35,12 @@ The metered costs must align closely to the true costs of running a smart contra In addition, metering must have: - High coverage: metering needs to cover all the non-trivial work done by the host. -- Metering needs to err on the side of worst case of the true cost. -- Metering based on the worst case must not deviate too far (10x) from the average cost. +- Moderate overestimate: Metering needs to err on the side of worst case of the true cost, but should not be too far (within the same order of magnitude) from the average true cost. ### Design goals -- Explainability – the metering model should be simple enough to understand and to explain the cost composition of a contract. +- Simplicity – the metering model should be simple enough to understand. The cost composition should be easy to explain and reason about. - Extensibility and maintainability – should be straightforward to add metering to future code. Changes in the implementation should not require rewrite of metering. Every iteration of code changes should not require complete model re-calibration. -- Metering should be cheap – the act of meter charging should not amount to a significant cost. -- Being able to detect when metering is missing in code paths. +- Efficiency – metering model should enable succinct implementation in the host that can be executed efficiently. ### Goals alignment Aligns with the general goals of the overview [cap-0046](./cap-0046.md) as as well the fee model [cap-0047-07](./cap-0046-07.md). 
@@ -67,9 +65,8 @@ Components and blocks may be wild or tame: - Code is **tame** if it’s code we wrote or are maintaining a fork of. ### Requirements for a component -1. Depends on a single input. -2. Independent from other components. -3. The cost of each resource type - `cpu_insns` and `mem_bytes` - follows a linear or constant characteristics w.r.t. the input. +1. Can be modeled as a constant or linear function w.r.t. a single input, on both resource types `cpu_insns` and `mem_bytes`. +2. Does not invoke another component. That is, components are the leaves of a call tree. ![Call tree diagram](../contents/cap-0046/0010/Call-tree-diagram.jpg) @@ -77,11 +74,11 @@ Components and blocks may be wild or tame: Consider the host code as a tree of called blocks and components (see figure 1), with the entrypoint at the root, blocks as interior nodes and components as leafs of the tree. We structure the host in such a way that ensures as an **invariant** that **every component in the call tree is metered on every path to it**. This is done by ensuring the following: -- Blocks consist only of trivial (un-metered) code, calls to components, and calls to other blocks. +- Blocks consist of only trivial (no need to meter) code, calls to components, and calls to other blocks. - Every piece of wild component is converted to a tame component, tracked by the cost model with a unique code number assigned to it. - Components are standalone and do not call other blocks or components — they are truly the leafs of the tree. -The full list of component types are defined in `enum ContractCostType`, see "XDR changes". +The full list of component types is defined in `enum ContractCostType`, see [XDR changes](#xdr-changes). Once the call-tree invariant is satisfied, we can ensure that if every single component is metered, the entire call-tree is metered.
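The component requirements and the call-tree invariant in the hunks above can be illustrated with a minimal Python sketch. It is not taken from the host implementation: the `Budget` class, the helper functions, and all parameter values are hypothetical. It assumes a cost is evaluated as `constTerm + linearTerm * input` with no extra scaling, and that exceeding the budget raises an error standing in for `SCEC_EXCEEDED_LIMIT`.

```python
import hashlib
from dataclasses import dataclass


class ExceededLimit(Exception):
    """Hypothetical stand-in for the SCEC_EXCEEDED_LIMIT host error."""


@dataclass(frozen=True)
class CostParams:
    """Mirrors the two terms of ContractCostParamEntry (constTerm, linearTerm)."""
    const_term: int
    linear_term: int

    def evaluate(self, input_size: int) -> int:
        # Assumed model: constant or linear in a single input.
        return self.const_term + self.linear_term * input_size


class Budget:
    """Tracks cumulative charges for one resource type against a limit."""

    def __init__(self, limit: int, params: dict):
        self.limit = limit
        self.params = params
        self.consumed = 0

    def charge(self, cost_type: str, input_size: int = 0) -> None:
        self.consumed += self.params[cost_type].evaluate(input_size)
        if self.consumed > self.limit:
            raise ExceededLimit(f"{self.consumed} > {self.limit}")


# Made-up calibration values, for illustration only.
CPU_PARAMS = {
    "HostMemCpy": CostParams(const_term=20, linear_term=1),
    "ComputeSha256Hash": CostParams(const_term=3000, linear_term=30),
}


def copy_bytes(budget: Budget, data: bytes) -> bytes:
    """Component (leaf): charges itself and calls no other component."""
    budget.charge("HostMemCpy", len(data))
    return bytes(data)


def sha256_component(budget: Budget, data: bytes) -> bytes:
    """Component (leaf): charge is linear in the number of hashed bytes."""
    budget.charge("ComputeSha256Hash", len(data))
    return hashlib.sha256(data).digest()


def hash_copy(budget: Budget, data: bytes) -> bytes:
    """Block: only trivial code plus calls to metered components."""
    return sha256_component(budget, copy_bytes(budget, data))


if __name__ == "__main__":
    budget = Budget(limit=10_000, params=CPU_PARAMS)
    hash_copy(budget, b"abcd")
    print(budget.consumed)  # (20 + 1*4) + (3000 + 30*4) = 3144
```

Because `hash_copy` contains no metering of its own, its total charge is just the sum of its leaf components, which is the property the invariant is meant to guarantee.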
@@ -94,38 +91,40 @@ To obtain the parameters, we isolate the component and set up a benchmark sandbo The result of calibration for per resource type is a set of cost parameters of size `C x 2`, where `C` is the number of cost types. The cost parameters per resource type form a `ConfigSettingEntry`. ### The budget -The budget for each resource type is a `ConfigSettingEntry` that is determined in consensus by the validators. The budget reflects the ledger processing capacity in accordance to the requirements in the "Requirements" section. We can start with an initial `cpu_insns` budget of 4'000'000 and `mem_bytes` of 10MB. These numbers may change before this CAP finalizes. +The budget for each resource type is a `ConfigSettingEntry` that is determined in consensus by the validators. The budget reflects the ledger processing capacity in accordance to the requirements in the [Requirements](#requirements) section. We can start with an initial `cpu_insns` budget of 2'500'000 and `mem_bytes` of 2MiB. -At every metering charging, the total charges will be compared with the budget, and if exceeds, will result in a `SCEC_EXCEEDED_LIMIT` host error. +At every metering charge, the cumulative resource consumption will be compared with the budget, and if exceeded, will result in a `SCEC_EXCEEDED_LIMIT` host error. ### XDR changes -See [cap-0046 Overview](./cap-0046-01.md), specifically the `ConfigSettingEntry` which has four new additions corresponding to budget and metering for each resource type, as well as the new file [Stellar-contract-cost-type.x](../contents/cap-0046/Stellar-contract-cost-type.x) that defines the cost types `ContractCostType` and cost parameters entry `ContractCostParamEntry`. +See [cap-0046 Overview](./cap-0046-01.md) and [Stellar-contract-config-setting.x](../contents/cap-0046/Stellar-contract-config-setting.x) for the XDR changes. In particular `ConfigSettingEntry` +contains new entries for budget and metering. 
`ContractCostParamEntry` defines all cost component types with their explanations in the comment. ### Metering an arbitrary new piece of code -The above have so far presented the definition of components, the list of components already identified in the host and how to calibrate each component to obtain the cost parameters. - -The main challenge of dealing with an arbitrary new piece of code (what the host starts out to be) is to identify the components through an iterative process: +The main challenge of dealing with an arbitrary new piece of code (*wild* or *tame*) is to identify the components through an iterative process: 1. Break down the code into a call tree where each node consists of meaningful, non-trivial operation. -2. Identify the leaf nodes, making sure they are components according to the “requirements for a component”. -3. For any TC, meter it according to “metering a component” -4. If it contains any wild code, follow "taming wild code” to tame it. This step needs to be done in junction with 3. +2. Identify the leaf nodes, making sure they are components according to the [requirements for a component](#requirements-for-a-component). +3. For any *tame* component, meter it according to [metering a component](#metering-a-component). +4. If it contains any *wild* code, follow [taming wild code](#taming-wild-code) to tame it. This step needs to be done in conjunction with 3. 5. Start from the leaf nodes, mark them as metered, then proceed up level by level until the reaching root. -- If a node is composed of only metered children, it is a metered block. -- Once the root is metered, the call-tree invariant is satisfied and the entire call-tree is metered. + +If a node is composed of only metered children, it is a metered block. Once the root is metered, the call-tree invariant is satisfied and the entire call-tree is metered.
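
The bottom-up marking in step 5 can be sketched as a recursive check over a call tree. This is an illustration only: `Node` and `is_metered` are hypothetical names, and the host does not necessarily represent its call tree as an explicit data structure.

```rust
/// A node in a hypothetical call tree.
enum Node {
    /// A leaf component; `metered` records whether it charges the budget.
    Component { metered: bool },
    /// An interior block: trivial code plus calls to child blocks/components.
    Block(Vec<Node>),
}

/// A component is metered if it is stamped as such; a block is metered if
/// every one of its children is metered. A block with no children contains
/// only trivial code and is therefore considered metered.
fn is_metered(node: &Node) -> bool {
    match node {
        Node::Component { metered } => *metered,
        Node::Block(children) => children.iter().all(is_metered),
    }
}
```

Calling `is_metered` on the root answers whether the call-tree invariant holds; a single unmetered leaf anywhere in the tree makes the root unmetered.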
### Taming wild code -As mentioned previously, one of the keys to satisfying the call-tree invariant is that all wild code, blocks or components, be tamed. This consists of the following patterns -1. A tamed block (TB) calling a tamed component (TC) -2. A wild block (WB) calling a TC -3. A TB calling a wild component (WC) -4. A TB calling a wild block (WB), where the wild block (WB) calls some other WC which we do not have access to. -For 1 and 2, metering is already covered by the TC and there is nothing else we need to do. +As mentioned previously, one of the keys to satisfying the call-tree invariant is that all *wild* code, blocks or components, be tamed. A piece of *wild* code can appear in one of the following patterns: +1. Consists of a single wild component (**WC**) +2. A wild block (**WB**) that only consists of tamed blocks (**TB**s) and tamed components (**TC**s) +3. A WB that consists of a mixture of TCs (recall a TB is just a combination of TCs) and WCs which we do not have access to. +4. A WB that consists of several WCs + +For 1, we are calling a WC which is standalone and does not call us back. We can easily tame the WC by defining it as a metered component following [metering a component](#metering-a-component). -For 3, we are calling a WC which is standalone and does not call us back. We can easily tame the WC by attaching a metering harness to it. +For 2, metering is already covered by the tamed code and there is nothing else we need to do. -The tricky scenario is 4, where a TB calls into a WB that calls into a mixture of WCs and TCs (if all of them are WCs, then the entire WB becomes a WC and we are in scenario 3). We have two options to deal with this scenario: -1. Approximate the WB as a new WC, using proper assumptions to separate out all of its logic dependencies from any TCs. Figure 2 illustrates this process and compares the call tree before and after. -2.
If 1 is not possible, we have to tame it the brute force way either by forking the code and modifying it, or choose a different library, or remove this functionality altogether. +For scenario 3, we first try to approximate the WB as pure wild code, i.e. by minimizing the footprint of TCs. Concretely this means during the calibration process, set up the samples (e.g. making `x = 0` in the linear function) such that the TBs have minimal effect on the output resource consumption. If this is possible, we end up in scenario 4. See figure 2 below for illustration. + +For scenario 4, we first approximate the WB as a single WC, by picking a single dominant input and calibrating it as a linear function. If it works, we end up back in scenario 1 and we are done. + +If either 3 or 4 fails, then we have to tame it the brute force way either by forking the code and modifying it, or choosing a different library, or removing our dependency on it altogether. ![Taming wild code](../contents/cap-0046/0010/Taming-a-call-tree.jpg) @@ -135,100 +134,56 @@ The tricky scenario is 4, where a TB calls into a WB that calls into a mixture o We use cpu instruction count as the main metrics for "compute" because it is a direct proxy to process running time, i.e. `run_time = cpu_insns_count / clock_freq / ave_insns_per_cycle`. The average instructions per cycle `ave_insns_per_cycle` depends on a set of CPU architecture-specific factors such as the instruction set, instruction length, micro-ops, instruction-level parallelism (which depends on instruction window size, branch-prediction), which are stable per architecture. -Assuming 2GHz cpu with an ave. insns per cycle of 2, 4'000'000 cpu instructions roughly equals 1ms. +Assuming 2GHz cpu with an avg. insns per cycle of 2, 1ms roughly equals 4'000'000 cpu instructions.
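
The relation `run_time = cpu_insns_count / clock_freq / ave_insns_per_cycle` and the 1ms figure can be checked with a small Rust sketch. The function name is hypothetical, and the 2GHz clock and 2 instructions per cycle are the assumptions stated in the text, not measured values.

```rust
/// Estimated run time in seconds for a given instruction count, following
/// run_time = cpu_insns_count / clock_freq / ave_insns_per_cycle.
fn run_time_secs(cpu_insns: u64, clock_hz: f64, insns_per_cycle: f64) -> f64 {
    cpu_insns as f64 / clock_hz / insns_per_cycle
}
```

With the assumed values, `run_time_secs(4_000_000, 2.0e9, 2.0)` comes out at roughly `0.001` seconds, i.e. about 1ms, which matches the estimate above.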
-Note that the instruction count may vary across architectures, but the metering model needs to be same across various archs, so we will need to provide a guidance on recommended setup for metering calibration. +Note that the instruction count may vary across architectures, but the metering model needs to be the same, so the metering model needs to produce the upper bound on all viable architectures. -Another considered alternative resource is execution time, which relates much closer to the actual cost in ledger closing time. However, execution time is much more volatile and less-deterministic, which make it a less desirable target metric for metering. +Another considered alternative resource is execution time, which relates much closer to the actual cost in ledger closing time. However, execution time is much more volatile and less deterministic, which makes it a less desirable target metric for metering. ### Why `mem_bytes` metric -The bytes of memory allocated is a good proxy of the memory footprint of contract execution. The majority of the smart contract memory footprint comes from 1. a fixed-sized linear memory 2. immutable host objects created during contract execution, and both of these are not freed until the end of contract execution. This memory model is very similar to the arena allocator. Using allocated memory as the metric is an worst-case approximation that is 1. close to the actual memory cost 2. gives us flexibility to switch to an actual arena allocator which would make it the actual cost. - - +The bytes of memory allocated is a good proxy of the memory footprint of contract execution. The majority of the smart contract memory footprint comes from 1. a fixed-sized linear memory 2. immutable host objects created during contract execution, and both of these are not freed until the end of contract execution. This memory model is very similar to the arena allocator. Using allocated memory as the metric is a worst-case approximation that is 1.
close to the actual memory cost 2. gives us flexibility to switch to an actual arena allocator later. ### Why do we have to model the costs? -In other words, why can't we profile the contract at runtime and use the results directly for metering? Because the profiling results are non-deterministic and 1. we can't use them for consensus 2. the contract execution outcome won't be able to be replayed bit-identically. Using an analytical model ensure determinism for consensus and replayability (more on this later). +In other words, why can't we measure and use the runtime resource consumption for metering? Because the profiling results are non-deterministic and 1. we can't use them for consensus 2. the contract execution outcome won't be able to be replayed bit-identically. Using an analytical model ensures determinism for consensus and replayability (more on this later). ### Why linear and constant components only? -The obvious reason is simplicity. We want the costs to follow a simple linear characteristic such that we can fit it accurately without needing a complex numerical model (and fitting process, heuristics etc). - -A model with higher order dependencies also risk the worst-case costs significantly outweighing the average, and any small deviation in the input resulting in significant over or underestimation of the costs. This goes against the design goals. +Simplicity. We want the costs to follow a simple linear characteristic such that we can fit it accurately without needing a complex numerical model (and fitting process, heuristics etc). +A model with higher order dependencies also risks the worst-case costs significantly outweighing the average, and any small deviation in the input resulting in significant over or underestimation of the costs. This goes against the [design goals](#design-goals). ### Host vs WASM vm -This metering framework is generic and does not differentiate between the host and the WASM vm.
Both the host and the vm are treated as components and blocks defined in the "specification" section and subject to the same metering procedures. - -Our current choice of the WASM virtual machine implementation is Wasmi, which is a lightweight interpreter of the wasm standard, written in the same language (Rust) as the host. Wasmi runs an inner interpreter loop that executes a single wasm instruction on each loop. Thus every wasm instruction logic fits the requirements of a component. `WasmInsnExecT0~4` in `ContractCostType` are designated for the wasm instructions (instead of having one type designated to each of the 100+ wasm instructions, we group them into tiers 0~4 where each tier of wasm instructions costs relatively the same amount of cpu insns). - -We maintain a fork of Wasmi with metering added. This makes Wasmi is a tamed "wild component". - -(Note this does not mean we are tied to a particular wasm implementation, it's just an example. If we decide to switch to a different interpreter or JIT in the future, we will be able to apply the same procedure to derive a new set of metering components.) +This metering framework is generic and does not differentiate between the host and the WASM vm. Both the host and the vm are treated as components and blocks defined in the [specification](#specification) section and subject to the same metering procedures. +Our current choice of the WASM virtual machine implementation is Wasmi, which is a lightweight interpreter of the wasm standard, written in the same language (Rust) as the host. Wasmi runs an inner interpreter loop that executes a single wasm instruction on each loop. Thus every wasm instruction logic fits the requirements of a component. `WasmInsnExec` in `ContractCostType` is designated for the wasm instructions. ### Relation to cap-0046-07 (fee model) -[CAP-0046-07](./cap-0046-07.md) proposed a fee model for smart contracts taking into account ledger access, storage and computation (or "gas"). 
This CAP details the computation aspect. However, this proposal identifies cpu and memory as separate aspects of the compute cost that needs to be budgeted separately. This difference needs to be resolved before this CAP finalize, i.e., either expand gas network settings in 07 or consolidate the `cpu_insns` and `mem_bytes` into a single "gas" parameter in here. +[CAP-0046-07](./cap-0046-07.md) proposed a fee model for smart contracts taking into account ledger access, storage and computation. This CAP details the computation aspect which includes cpu and memory. The metered `cpu_insns` goes into the fee model as input to the "compute" fee. While `mem_bytes` is not part of the fee model, it is subject to the network limit. ### Cost estimation -This proposal relies on the "preflight" mechanism to provide users with cost estimation of a transaction. The total costs for each resource type as well as inputs to each individual cost type will be returned from the preflight simulation. These costs, however can only serve as guidance to the actual cost, since the ledger snapshot used for preflight may be outdated. Thus it is not guaranteed that a transaction staying below the budget during preflight will not exceed it during the actual run. +This proposal relies on the "preflight" mechanism to provide an estimation of the cpu and mem consumption in a transaction. These can only serve as guidance to the actual cost, since the ledger snapshot used for preflight may be outdated, as well as the actual logic during preflight and actual ("recording" vs "enforcing") modes may be different. Thus it is not guaranteed that a transaction staying below the budget during preflight will not exceed it during the actual run. -## Parameters Upgrade -Both the budget and metering parameters are stored on the ledger as `ConfigLedgerEntry` and their upgrade and validation process have been discussed in [CAP-0046-09](./cap-0046-09.md). 
In general, the parameters can be upgraded with or without a protocol version upgrade. +### Config Settings Upgrade +Both the budget and metering parameters are stored on the ledger as `ConfigSettingEntry` and their upgrade and validation process have been discussed in [CAP-0046-09](./cap-0046-09.md). In general, the settings can be upgraded with or without a protocol version upgrade. -In the case of a protocol version upgrade, here are the scenarios where the parameters also has to be upgraded: -- New blocks have been introduced in the host that require introducing new components. Such changes include e.g. a new crypto primitive function. Note that if a new block merely consists of trivial code and calling existing components, then it has no effect on metering and no upgrade is needed. -- Changes on the host components, or version changes in its dependencies (e.g. Rust) that result in observable difference in components' cost characteristics. In rare cases, if the cost characteristics becomes no longer linear, then the component needs to be broken down into finer sub-components. See "Taming wild code" section above. +In the case of a protocol version upgrade, here are the scenarios that also require a settings upgrade: +- New blocks have been introduced in the host that require introducing new components. Such changes include e.g. a new crypto primitive function. Note that if a new block merely consists of trivial code and calling existing components, then no settings upgrade is needed. +- Changes on the host components, or version changes in its dependencies that result in observable difference in components' cost characteristics. In rare cases, if the cost characteristics becomes no longer linear, then the component needs to be broken down into finer sub-components. See [Taming wild code](#taming-wild-code). 
### The “metered” stamp We may need to introduce a new mechanism for stamping the metered entities in the host, following the definitions of wild/tamed components/blocks outlined in previous section. Such a mechanism would help us ensuring the call-tree invariant is satisfied by examining the root block. A further mechanism to automatically detect if metering is missing on a path would be even more ideal. We will also need to introduce set of reviewing standards that differentiates between block vs component changes. A metered component is subject to significantly higher bars for review and audit, to make sure the component criteria are truly satisfied, as they are the foundational building blocks of the budget metering framework. -## Open Issues +In the future we may add tooling around ensuring metering coverage and assisting with updating parameters or adding new metered components. ### Maintainability The cost parameters need to be maintained to prevent the metering model from gradually deviating away from reality (model drift). Even if we maintain the same host unchanged, the host's dependencies may change that result in small performance differences which can accumulate over time, causing the cost models to drift. To combat that, we will need to publish a set of specs where the metering calibration benchmark needs to be run regularly, along with a suite of tests and criteria for determining when the model parameters need to be updated. -### Versioning and Replayability - -Although the metering models are deterministic, the model inputs may vary across different software versions. For example, consider a third-party library routine that calls our host object comparison component `obj_compare` for an unknown number of times. The metering of that routine is therefore delegated to `obj_compare`. 
If a software upgrade happens to the routine which results in the number of `obj_compare` call to be increased from `N` to `N+1`, the cost will be different (which may effect the success-or-failure status of a contract) even though no other observable difference exists. In other words, due to the intricate relations between metering logic and the code logic under execution, the surface area of observable differences between a transaction's execution and its replay have been enlarged. - -This is not a problem for consensus, as long as all the validators maintain the exact same software version. There are two options to solve the replay problem: -1. Maintain multiple software versions simultaneously. For an old protocol version, its exact host software version needs to be included in the current stellar-core. A version map between protocol version and the host software version needs to be maintained and looked up during replay. In practice, the number of software versions could be less than the number of protocol versions, since a protocol version upgrade may not result in observable differences in any of the transactions' replay between the old and the new version. In which case, the older software version can be retired and replaced by the newer version, but this is more of an exception. -2. Make the cost results irrelevant in replay. In other words, relax the bit-identicalness requirement for contract execution costs. During replay: - - On a successful SC transaction, take the fee due to contract execution (cpu and memory costs) as the "truth" (in order to produce the correct hash), and ignore the metering logic which arrive to those results. Also set the budget to unlimited so the replay transaction cannot fail due to out of budget. - - On a failed SC transaction, skip the transaction. A failed SC transaction ought to not have any side effects, so that it is safe to be skipped. 
- -The main pro of option 1 is that it preserves the bit-identicalness property of replay, however, at the cost of increased maintenance burden. - -The rationale for option 2, besides easier software maintenance, is that the accounting logic of metering should not have any significance besides arriving at the success-or-failure status and the fee charged of a contract transaction. In other words, no other side effects should be produced as a result of metering that is relevant to the observable outcomes of a transaction, thus justifies the choice of skipping the metering process altogether during replay. - -However, there are several cons of option 2: -- Adds the limitation that metering cannot produce any side effect besides the execution cost numbers, which must be the end results in all current and future transactions. This prohibits the possibility that a contract transaction relies on intermediate execution cost results as part of its logic, such as deciding whether or not to call another contract based on how much budget it has remaining. -- Places a dependency of replay on the transaction results. -- The budget metering is just a special case in the broader issue of host software versioning. Even without budget metering, the surface area of potential differences between a live execution and its replay is already large and unpredictable, thus necessitates host multi-versioning. The budget metering just increases such surface area. - -Based on above concerns, option 1 is likely the preferred option. - -The broader issue of host software versioning will be discussed in a different chapter and must be finalized before this CAP finalizes. - ## Security Concerns Missed or inaccurate metering can cause security concerns in two aspects: - **Denial of Service**: the computed costs significantly underestimate the true cost of running a contract, this can slowdown the validators and prevent them to close the ledger in an acceptable time frame. 
- **Under-Utilization of the Ledger Capacity**: this is not a direct attack per se. However, a side effect of overestimation in metering, is the ledger could be filled with many (deliberately crafted) fast contract transactions which theoretically could require more resource at the worst case, causing the ledger to be under-utilized. This may in turn cause other (important) transactions to queue up and not making into the ledger in a reasonable time. ## Implementation -The budget and metering, calibration has been implemented in the host, primarily: -- [PR 118](https://github.com/stellar/rs-soroban-env/pull/118) contains the initial budget and metering framework -- [PR 307](https://github.com/stellar/rs-soroban-env/pull/307) more comprehensive coverage of metering -- [PR 561](https://github.com/stellar/rs-soroban-env/pull/561) adds the calibration framework -- [PR 597](https://github.com/stellar/rs-soroban-env/pull/597) calibration for wasm instructions - -in Wasmi (our fork of the Wasm interpreter): -- [PR 1](https://github.com/stellar/wasmi/pull/1) -- [PR 10](https://github.com/stellar/wasmi/pull/10) - -and in the sdk: -- [PR 789](https://github.com/stellar/rs-soroban-sdk/pull/789) - -The stellar-core side implementation has not been done yet. \ No newline at end of file +Metering, budget and calibration have been implemented in [soroban-env](https://github.com/stellar/rs-soroban-env). Related integration work (such as the config settings) has been done in stellar-core and [soroban-sdk](https://github.com/stellar/rs-soroban-sdk). \ No newline at end of file From d0ef9fb99bd01b02c0924717007d13329e5e6dc5 Mon Sep 17 00:00:00 2001 From: Philip Liu <12836897+philipliu@users.noreply.github.com> Date: Wed, 30 Aug 2023 15:26:49 -0400 Subject: [PATCH 7/7] SEP-6: Add support for asynchronous deposit instructions (#1379) ### Proposal This adds new fields to the `transaction` object to facilitate Anchors providing deposit instructions outside of `GET /deposit` response.
### Backwards Compatibility This change is backward compatible as it does not introduce new required fields if Anchors can provide deposit instructions in the `GET /deposit` response. Resolves https://github.com/stellar/stellar-protocol/issues/1372, https://github.com/stellar/stellar-protocol/issues/1368. --- ecosystem/sep-0006.md | 76 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 66 insertions(+), 10 deletions(-) diff --git a/ecosystem/sep-0006.md b/ecosystem/sep-0006.md index 9a33c7a8e..5f3113640 100644 --- a/ecosystem/sep-0006.md +++ b/ecosystem/sep-0006.md @@ -7,7 +7,7 @@ Author: SDF Status: Active (Interactive components are deprecated in favor of SEP-24) Created: 2017-10-30 Updated: 2023-08-15 -Version 3.19.0 +Version 3.20.0 ``` ## Simple Summary @@ -358,7 +358,8 @@ The response body should be a JSON object with the following fields: Name | Type | Description -----|------|------------ -`how` | string | Terse but complete instructions for how to deposit the asset. In the case of most cryptocurrencies it is just an address to which the deposit should be sent. +`how` | string | (**Deprecated**, use `instructions` instead) Terse but complete instructions for how to deposit the asset. In the case of most cryptocurrencies it is just an address to which the deposit should be sent. +`instructions` | object | (optional) JSON object containing the [SEP-9 financial account fields](sep-0009.md#financial-account-fields) that describe how to complete the off-chain deposit. If the anchor cannot provide this information in the response, the wallet should query the [`/transaction`](#single-historical-transaction) endpoint to get this asynchronously. `id` | string | (optional) The anchor's ID for this deposit. The wallet will use this ID to query the [`/transaction`](#single-historical-transaction) endpoint to check status of the request. `eta` | int | (optional) Estimate of how long the deposit will take to credit in seconds.
`min_amount` | float | (optional) Minimum amount of an asset that a user can deposit. @@ -367,18 +368,53 @@ Name | Type | Description `fee_percent` | float | (optional) Percentage fee (if any). In units of percentage points. `extra_info` | object | (optional) JSON object with additional information about the deposit process. +`instructions` fields: + +An object with SEP-9 financial account fields as keys and its values are objects with the following fields: + +Name | Type | Description +-----|------|------------ +`value` | string | The value of the field. +`description` | string | A human-readable description of the field. This can be used to provide any additional information about fields that are not defined in the SEP-9 standard. + `extra_info` fields: Name | Type | Description -----|------|------------ `message` | string | (optional) Additional details about the deposit process. +##### Examples + +Bank payment example: +```json +{ + "id": "9421871e-0623-4356-b7b5-5996da122f3e", + "instructions": { + "organization.bank_number": { + "value": "121122676", + "description": "US bank routing number" + }, + "organization.bank_account_number": { + "value": "13719713158835300", + "description": "US bank account number" + } + }, + "how": "Make a payment to Bank: 121122676 Account: 13719713158835300" +} +``` + Bitcoin response example: ```json { - "how" : "1Nh7uHdvY6fNwtQtM1G5EZAFPLC33B59rB", "id": "9421871e-0623-4356-b7b5-5996da122f3e", + "instructions": { + "organization.crypto_address": { + "value": "1Nh7uHdvY6fNwtQtM1G5EZAFPLC33B59rB", + "description": "Bitcoin address" + } + }, + "how": "Make a payment to Bitcoin address 1Nh7uHdvY6fNwtQtM1G5EZAFPLC33B59rB", "fee_fixed" : 0.0002 } ``` @@ -387,10 +423,20 @@ Ripple response example: ```json { - "how" : "Ripple address: rNXEkKCxvfLcM1h4HJkaj2FtmYuAWrHGbf tag: 88", "id": "9421871e-0623-4356-b7b5-5996da122f3e", + "instructions": { + "organization.crypto_address": { + "value": "rNXEkKCxvfLcM1h4HJkaj2FtmYuAWrHGbf", + 
"description": "Ripple address" + }, + "organization.crypto_memo": { + "value": "88", + "description": "Ripple tag" + } + }, + "how": "Make a payment to Ripple address rNXEkKCxvfLcM1h4HJkaj2FtmYuAWrHGbf with tag 88", "eta": 60, - "fee_percent" : 0.1, + "fee_percent": 0.1, "extra_info": { "message": "You must include the tag. If the amount is more than 1000 XRP, deposit will take 24h to complete." } @@ -401,8 +447,14 @@ Mexican peso (MXN) response example: ```json { - "how" : "Make a payment to Bank: STP Account: 646180111803859359", "id": "9421871e-0623-4356-b7b5-5996da122f3e", + "instructions": { + "organization.clabe_number": { + "value": "646180111803859359", + "description": "CLABE number" + } + }, + "how": "Make a payment to Bank: STP Account: 646180111803859359", "eta": 1800 } ``` @@ -457,7 +509,7 @@ Using this feature, anchors will no longer have to wait until the user's Stellar 1. Make a request to `/deposit` and provide the `claimable_balance_supported=true` request parameter. 2. Register the user's KYC information with the anchor via [SEP-12](sep-0012.md) if requested and resubmit the deposit request. -3. Once a successful deposit request has been made and the transaction's status is `pending_user_transfer_start`, the user must send the required payment as described by the `how` attribute in the deposit success response, using the `amount_in` returned from the `GET [SEP-6]/transaction?id=` request. +3. Once a successful deposit request has been made and the transaction's status is `pending_user_transfer_start`, the user must send the required payment as described by the `instructions` attribute in the deposit success response, using the `amount_in` returned from the `GET [SEP-6]/transaction?id=` request. 4. If the anchor doesn't support claimable balances, the anchor's `/transaction(s)` endpoint will contain the `pending_trust` status. In this case, use the flow described [above](#stellar-account-doesnt-trust-asset). 5. 
Otherwise, detect the `claimable_balance_id` value populated in the anchor's `/transaction(s)` endpoint or poll Horizon's [/claimable_balances](https://developers.stellar.org/api/resources/claimablebalances/) endpoint for outstanding claimable balances. When a claimable balance is detected using either method, the transaction status should be `completed`. 6. Claim the balance using the value via the `ClaimClaimableBalance` operation. See the ["Claiming Claimable Balances"](#claiming-claimable-balances) section to learn more about how to claim a balance. @@ -943,7 +995,7 @@ All assets listed in a `deposit` and `deposit-exchange` can contain these attrib * `enabled`: `true` if SEP-6 deposit for this asset is supported * `authentication_required`: Optional. `true` if client must be [authenticated](#authentication) before accessing the deposit endpoint for this asset. `false` if not specified. -* `fields` object as explained below. +* `fields` (**Deprecated**, Accepting personally identifiable information through request parameters is a security risk due to web server request logging. KYC information should be supplied to the Anchor via SEP-12) `fields` object is explained below. Deposit assets listed in the `deposit` object can also contain the attributes: @@ -1123,6 +1175,9 @@ Name | Type | Description `refunds` | object | (optional) An object describing any on or off-chain refund associated with this transaction. The schema for this object is defined in the [Refunds Object Schema](#refunds-object-schema) section below. `required_info_message` | string | (optional) A human-readable message indicating any errors that require updated information from the user. `required_info_updates` | object | (optional) A set of fields that require update from the user described in the same format as [/info](#info). This field is only relevant when `status` is `pending_transaction_info_update`. 
+`required_customer_info_message` | string | (optional) A human-readable message indicating why the SEP-12 information provided by the user is not sufficient to complete the transaction. +`required_customer_info_updates` | string | (optional) A set of SEP-9 fields that require update from the user via SEP-12. This field is only relevant when `status` is `pending_customer_info_update`. +`instructions` | object | (optional) JSON object containing the [SEP-9 financial account fields](sep-0009.md#financial-account-fields) that describe how to complete the off-chain deposit in the same format as the [/deposit](#deposit) response. This field should be present if the `instructions` were provided in the [/deposit](#deposit) response or if it could not have been previously provided synchronously. This field should only be present once the status becomes `pending_user_transfer_start`, not while the transaction has any statuses that precede it such as `incomplete`, `pending_anchor`, or `pending_customer_info_update`. `claimable_balance_id` | string | (optional) ID of the Claimable Balance used to send the asset initially requested. Only relevant for deposit transactions. `status` should be one of: @@ -1363,7 +1418,7 @@ Every HTTP status code other than `200 OK` will be considered an error and in th ## Pending Customer Info Update -In certain cases the anchor may need updated customer information from the user. For example, the bank could tell the anchor that the account address does not match the user's name or other identifying information. Since this information was sent via SEP-12, the transaction should go into the `pending_customer_info_update` status until the sender makes another `PUT /customer` request to update. The sending anchor can check which fields need to be updated by making a `GET /customer` request including the `id` or `account` & `memo` parameters. The anchor should respond with a `NEEDS_INFO` status and include the fields that need to be updated.
+In certain cases the anchor may need updated customer information from the user. For example, the bank could tell the anchor that the account address does not match the user's name or other identifying information. Since this information was sent via SEP-12, the transaction should go into the `pending_customer_info_update` status until the sender makes another `PUT /customer` request to update by providing the fields from `required_customer_info_updates` in the transaction object. The wallet can also check which fields need to be updated by making a `GET /customer` request including the `id` or `account` & `memo`. The anchor should respond with a `NEEDS_INFO` status and include the fields that need to be updated. ## Pending Transaction Info Update @@ -1422,7 +1477,8 @@ If the information was malformed, or if the sender tried to update data that isn ## Changelog -* `v3.19.0`: Deprecate `/fee` endpoint +* `v3.20.0`: Add support for asynchronous deposit instructions. ([#1379](https://github.com/stellar/stellar-protocol/pull/1379/)) +* `v3.19.0`: Deprecate `/fee` endpoint. ([#1381](https://github.com/stellar/stellar-protocol/pull/1381)) * `v3.18.1`: Fix the missing types of the `withdraw` request parameters and some typo. ([#1365](https://github.com/stellar/stellar-protocol/pull/1365)) * `v3.18.0`: Added `refunded` status and `updated_at` transaction fields to match other SEPs (24, 31) ([#1336](https://github.com/stellar/stellar-protocol/pull/1336)) * `v3.17.1`: Allow anchors to omit the deprecated `X-Stellar-Signature` header ([#1335](https://github.com/stellar/stellar-protocol/pull/1335))