Preamble

CAP: 0046-01 (formerly 0046)
Title: WebAssembly Smart Contract Runtime Environment
Working Group:
    Owner: Graydon Hoare <@graydon>
    Authors: Graydon Hoare <@graydon>
    Consulted: Leigh McCulloch <@leighmcculloch>, Tomer Weller <@tomerweller>, Jon Jove <@jonjove>, Nicolas Barry <@MonsieurNicolas>, Thibault de Lacheze-Murel <@C0x41lch0x41>
Status: Draft
Created: 2022-04-18
Discussion: https://groups.google.com/g/stellar-dev/c/X0oRzJoIr10
Protocol version: TBD

Simple Summary

This CAP specifies the lowest-level code execution and data model aspects of a WebAssembly-based (WASM) "smart contract" system for the Stellar network. WASM smart contract code runs as a guest inside of a virtual machine (VM) which is embedded in a host environment.

Higher-level components of a smart contract system such as ledger entries, host objects and host functions, and transactions to manage and invoke contracts will be specified in additional CAPs. This CAP focuses only on the lowest-level components.

No new operations or ledger entries are introduced in this CAP. Nothing observably changes in the protocol available to users. This CAP is best understood as a set of building blocks for later CAPs, introducing a vocabulary of concepts, data types and implementation components.

The design in this CAP is derived from a working and much more complete prototype that includes much that is left out of this CAP. This CAP is being proposed separately to facilitate early discussion of the building blocks, and to help decompose the inevitably-large volume of interrelated changes required for a complete smart contract system into smaller, more understandable pieces.

Working Group

This protocol change was authored by Graydon Hoare, with input from the consulted individuals mentioned at the top of this document.

Motivation and Goals Alignment

See the Soroban overview CAP.

Requirements

Primary requirements

The primary requirement for any smart contract system is to enable, within certain parameters, arbitrary new functionality to be added to a blockchain's state-transition function by users. This can be further decomposed to two requirements of concern in this CAP:

Code: stellar-core's state transition function must be extended with some means of executing, within parameters, some form of user-provided Turing-complete instruction code. Preferably in a compact form that can be stored within the ledger.
Data: stellar-core's model of data -- comprising transaction input, output, persistent state and temporary working memory -- must be extended to include data of concern to smart contracts: their input, output, persistent state, and temporary working memory during execution. Any transformations between each of these sorts of data must be specified, even if partially delegated to contract logic.

Required parameters to mitigate risks

While the primary requirements seem simple enough to meet -- "just add a VM" -- there are many risks associated with a naive implementation. Therefore subsequent requirements take the form of parameters that constrain implementations in order to mitigate risks, including:

Secure: the smart contract system should be secure against erroneous or malicious smart contract code as well as contract-code input that could imperil system availability, integrity, or confidentiality (in the few cases where secret data exists). In particular at the level of this CAP, the design should guard against:
- The risk of resource exhaustion, leading to denial of service by validators.
- The risk of VM escape, leading to arbitrary Byzantine failures on validators, including data corruption or unauthorized transactions.
- The risk of side channels, allowing VM code to extract validator private keys or other secret data on validators.
- The risk of unintended contract behaviour due to invocation with malicious input data.
- The risk of unintended contract behaviour due to calls to or from malicious contracts.
Well-defined: the smart contract system should not compromise the network's bit-precise consensus or historical replay functions, and should have a well-defined and unambiguous semantics for any code or data added by users. Where possible this should be maintained by reference to existing, well-defined standards. In particular at the level of this CAP, the design should guard against:
- The risk of underspecified or nondeterministic VM code.
- The risk of underspecified or nondeterministic datatypes.
Performant: the smart contract system should not compromise the performance of the network, and should perform competitively with other smart contract systems. Users should not be subject to a significant performance penalty for using smart contracts instead of built-in transactions. In particular at the level of this CAP, the design should guard against:
- The risk of needing to load, compile, instantiate or run a large amount of VM code per transaction. Contracts should be small.
- The risk of contending on shared mutable data that may defeat parallel execution of transactions. Contracts should be isolated.
- The risk of requiring smart contract developers to do extensive optimization to achieve acceptable performance.
Interoperable: the smart contract system will necessarily introduce some new user-defined semantics which are by definition unknown to some users and 3rd parties. But beyond such necessary risks, the smart contract system should avoid introducing unnecessary hazards to interoperability, especially through choice of data encoding for input, output and persistent state. In particular at the level of this CAP, the design should guard against:
- The risk of being unable to share data between different contracts, or different versions of the same contract.
- The risk of being forced to write contracts in, or invoke contracts from, a single programming language.
- The risk of having no tools or only immature tools for working with any programming language targeting the VM.
- The risk of being unable to passively observe contract state for testing, debugging, diagnosis or monitoring.
- The risk of 3rd parties being unable to exchange data with contracts.
Simple: the smart contract system should be as simple as possible while achieving other requirements. It should not require excessive innovation or expensive engineering by either developers or users of stellar-core. Smart contracts are late in coming to the Stellar Network, there is plenty of prior art to draw from, and there is a limited window of time to complete the work. At the level of this CAP, the design should guard against:
- The risk of designing or implementing a novel VM, programming language, client library, or serialization format.
- The risk of selecting an existing platform that is incompatible with or causes major changes to stellar-core.
- The risk of delivering a system that is too challenging to learn for users or 3rd parties.

Abstract

The specification consists of three parts:

A general description of the concepts of host and guest contexts, their relationships, constraints, and methods of implementation.
A specification of the new components that provide the host and guest contexts, their means of interaction, and their lifecycle phases.
A specification of the data model shared between host and guest.

Specification

Context

This CAP specifies aspects of two separate but related contexts:

The host context: this consists of portions of the existing C++ code making up stellar-core that can be accessed by smart contracts, as well as some new C++ and Rust code implied by this CAP. New C++ and Rust code includes the implementation of a WebAssembly (WASM) virtual machine, a set of host objects, and a host environment that contains and manages the lifecycle and interaction of the host objects and virtual machines. The host environment, like the rest of stellar-core, is compiled to native code and runs with full access to its enclosing operating system environment, the ledger, the network, etc. The term "host environment" here corresponds to the term with that name in the WebAssembly specification.
The guest context: this consists of WASM code interpreted by a WASM virtual machine embedded in the host environment. Guest code may originate in any programming language able to target WASM, and will be provided by means unspecified in this CAP. Guest code has very limited access to its enclosing host environment: it can only consume CPU and memory resources to the extent that the host environment permits, and it can only call host functions that the host environment explicitly provides access to. The purpose of the guest context is to act as a so-called "sandbox" to attenuate potential harms caused by erroneous or malicious guest code, while allowing "just enough" programmability to satisfy the needs of users.

Components

The guest and host contexts are provided by two new components added to stellar-core: a virtual machine and a host environment.

Virtual Machine

Code for a WebAssembly 1.0 virtual machine (VM) is embedded in stellar-core. The VM can be instantiated multiple times in the same stellar-core process, effectively supporting multiple separate guest contexts. The VM is configured with specific limits, and excludes support for any subsequent WebAssembly specification revisions or proposals.

Furthermore, to limit potential nondeterminism risks (see below), floating point instructions are prohibited: any WASM code that includes floating point instructions will fail validation and be rejected with an error.

Input guest code for a guest context is a single WASM module in the specified WASM binary format, and guest code will pass through all 4 semantic phases defined in the WASM specification: decoding, validation, instantiation and execution. See the linked specification for details.

Host environment

A new structure called a host environment is added to the transaction-processing subsystem of stellar-core. A host environment is a container carrying:

Zero or more WASM VMs.
Any host objects that guest code in a WASM VM can refer to.
Any resource-accounting mechanisms for guest code.
Any host functions that guest code in a WASM VM can import.

Interface

The interface between the host environment and guest code is very narrow and is defined by the WASM specification of embedding. A summary of some relevant aspects is repeated here:

Guest memory ("WASM linear memory") is separated from host memory. The host may have a mechanism to access guest memory, but the guest has no mechanism to access host memory.
There are exactly 4 types of data values shared between guest and host: i32, i64, f32, and f64. These are 32 and 64-bit 2s complement integers (with undefined "signedness") and 32 and 64-bit IEEE754 binary floating point values.
Guest code modules carry a list of exported functions (that the guest provides and the host can call) and a list of imported functions (that the host provides and the guest can call). Both imported and exported functions can only pass a sequence of parameters of the 4 shared data types and return a single value of the 4 shared data types, or a trap.
Various error conditions may result in a guest trap condition, which is a terminal state for the WASM VM running the guest code: no further VM execution can occur after it traps. A trap may be generated by guest code due to an execution error, or may be generated by a host function called from guest code. Therefore any call from guest to host or host to guest may produce a trap result rather than a value.
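As a rough illustration of the last two points, the sketch below models a cross-boundary call as returning either at most one value of the 4 shared types or a trap, and treats a trap as terminal for the VM. All names (WasmValue, Trap, CallResult, GuestVm, invoke) are hypothetical and chosen for this example only; the closure stands in for real guest execution and is not part of any WASM engine's API.

```rust
/// The 4 value types that can cross the host/guest boundary.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WasmValue {
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
}

/// A trap carries only a diagnostic message in this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Trap(pub String);

/// A call in either direction yields at most one value, or a trap.
pub type CallResult = Result<Option<WasmValue>, Trap>;

/// Hypothetical stand-in for a VM instance; it only tracks the trapped state.
pub struct GuestVm {
    trapped: bool,
}

impl GuestVm {
    pub fn new() -> Self {
        GuestVm { trapped: false }
    }

    /// Run `body` as a stand-in for guest execution. Once any call traps,
    /// every later call is refused, since a trap is a terminal state.
    pub fn invoke(
        &mut self,
        args: &[WasmValue],
        body: impl FnOnce(&[WasmValue]) -> CallResult,
    ) -> CallResult {
        if self.trapped {
            return Err(Trap("VM already trapped".to_string()));
        }
        let result = body(args);
        if result.is_err() {
            self.trapped = true;
        }
        result
    }
}
```

In this model a host function that fails simply returns Err(Trap(..)) from the closure, which is indistinguishable to the caller from a trap raised by guest code.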

Lifecycles

A host environment has its own lifecycle: it is created before any of the host objects or VMs it contains, and destroyed after any of the host objects or VMs it contains.

When a host environment is created, it contains no host objects and no VMs.

Adding a WASM VM to a host environment involves passing WASM code through the 4 lifecycle phases in the WASM specification. If any phase fails, no further phases will be performed on the failed WASM VM.

Multiple WASM VMs can coexist in a single host environment. The intention is that one host environment and one WASM VM will be created for an "outermost" invocation of a smart contract, and that "inner" contracts can be invoked by guest code calling a host function that constructs an additional VM and invokes a guest function in that new VM, within the same shared host environment. The specific mechanism of calling between contracts is not specified in this CAP.

Multiple WASM VMs in the same host environment can refer to the same host objects: this is the mechanism for passing (immutable) information between different smart contracts.

Limits

TBD. Implementation-defined limits will be specified here before finalization of the CAP.

Additional implementation-defined limits will be specified to restrict the consumption of host resources by guest code. In particular, a step-counter or "gas limit" will be imposed on the number of instructions executed by guest code. Additionally any computation, memory or IO resources consumed by host functions called by guest code will be accounted-for. Any guest code that exceeds limits will terminate with an error.
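Since the concrete limits are still TBD, the following is only a minimal sketch of what a deterministic step counter could look like. The names Budget, BudgetError and charge are hypothetical, and the unit of cost is left abstract; the only property shown is that exceeding the limit is an error and that a failed charge consumes nothing.

```rust
/// Error returned when guest code would exceed its limit.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetError {
    LimitExceeded,
}

/// Hypothetical resource budget shared by guest instructions and host functions.
pub struct Budget {
    limit: u64,
    used: u64,
}

impl Budget {
    pub fn new(limit: u64) -> Self {
        Budget { limit, used: 0 }
    }

    /// Charge `cost` units. Uses checked arithmetic so that overflow is
    /// reported as exhaustion rather than wrapping around.
    pub fn charge(&mut self, cost: u64) -> Result<(), BudgetError> {
        let total = self
            .used
            .checked_add(cost)
            .ok_or(BudgetError::LimitExceeded)?;
        if total > self.limit {
            return Err(BudgetError::LimitExceeded);
        }
        self.used = total;
        Ok(())
    }

    /// Units still available before the limit is reached.
    pub fn remaining(&self) -> u64 {
        self.limit - self.used
    }
}
```

Both the VM's instruction loop and each host function would call charge before doing work, so that exhaustion is observed at the same point on every validator.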

Determinism

Both guest code and any part of the host environment controlled by guest code must execute deterministically in response to inputs, and must be sufficiently well-specified that replaying historical guest code in an upgraded host environment (i.e. a new version of stellar-core) will produce observably-identical results. This includes the result of observable resource exhaustion within host-controlled CPU or memory limits, which implies the need for careful resource accounting on all guest-controlled actions.

The WASM spec has carefully limited nondeterminism to a small set of cases, which we consider here:

New features: no WASM features beyond the 1.0 spec are supported by the smart contract system.
Threads: not supported by the smart contract system.
NaN-related behaviour for floating point: all floating point code is prohibited.
SIMD-related behaviour: all SIMD extensions are prohibited.
Environment-resource limit exhaustion: will be specified in the Limits section above.

Data Model

This CAP defines a data model shared between guest and host environments. It consists of a set of values and a set of objects:

Values can be packed into a 64-bit integer, and can therefore be easily passed back and forth between the host environment and guest code, as arguments or return values from imported or exported functions.
Objects (also called "host objects") exist only in host memory, in the host context, and can only be referenced by guest code through values containing handles that refer to objects. If guest code wishes to perform an operation on a host object, it must call a host function with values containing handles that refer to any host object(s) to operate on.

Immutability

Values and Objects are both immutable: they cannot be changed once created. Any operation on a host object that implies a modification of the object's state will allocate a new object with the modified state, and return a value that refers to the new object. Objects must therefore be relatively lightweight, and reuse shared substructures where possible.
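The "modify by allocating" rule can be sketched as follows. This is an illustration under stated assumptions, not the host's actual interface: VecStore, vec_new, vec_push and vec_get are hypothetical names, elements are plain u64 values, and the update copies the whole vector where a real implementation would share substructure.

```rust
/// Hypothetical store of immutable vectors; handle h lives in slot h - 1.
pub struct VecStore {
    vectors: Vec<Vec<u64>>,
}

impl VecStore {
    pub fn new() -> Self {
        VecStore { vectors: Vec::new() }
    }

    /// Allocate an empty vector and return its handle.
    pub fn vec_new(&mut self) -> u32 {
        self.vectors.push(Vec::new());
        self.vectors.len() as u32
    }

    /// "Push" never changes the object behind `handle`: it allocates a new
    /// object holding the modified state and returns the new handle.
    pub fn vec_push(&mut self, handle: u32, val: u64) -> Option<u32> {
        let idx = (handle as usize).checked_sub(1)?;
        let mut copy = self.vectors.get(idx)?.clone();
        copy.push(val);
        self.vectors.push(copy);
        Some(self.vectors.len() as u32)
    }

    /// Read-only view of the vector behind `handle`, if it exists.
    pub fn vec_get(&self, handle: u32) -> Option<&[u64]> {
        let idx = (handle as usize).checked_sub(1)?;
        self.vectors.get(idx).map(|v| v.as_slice())
    }
}
```

After a push, the old handle still denotes the old (shorter) vector, which is exactly the stale-reference hazard weighed in the rationale for immutable objects.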

Forms

The data model is specified in two separate forms:

In XDR, for inclusion in serial forms such as transactions and ledger entries.
In a set of "host types", of which the "host value type" is shared between host and guest.

The rationale for the two separate forms is given below, in the rationale section.

XDR changes

See the new XDR files in the Soroban overview CAP.

Host value type

The host value type is a 64-bit integer carrying a bit-packed disjoint union of several cases:

The least-significant bit differentiates between two primary cases:
- If it is 0, the remaining 63 bits encode a positive (that is, non-negative) signed 64-bit integer.
- If it is 1, the remaining 63 bits encode a low 3-bit tag and a high 60-bit body.
The 8 tag values define an interpretation of the body, from least-significant to most-significant bits:
- Tag 0: a 32-bit unsigned integer followed by 28 zero bits.
- Tag 1: a 32-bit signed integer followed by 28 zero bits.
- Tag 2: a static set of 60-bit values, of which the first 3 are void (0), true (1) and false (2).
- Tag 3: an object reference given by a 28-bit type code followed by a 32-bit handle.
- Tag 4: a symbol having 10 or fewer 6-bit character codes drawn from the character repertoire [_0-9A-Za-z], with _ assigned code 1 and trailing positions in the symbol filled with a zero code, and code positions starting at the least significant 6 bits of the body.
- Tag 5: a bitset consisting of 60 1-bit flags.
- Tag 6: a status value consisting of a 28-bit type code followed by a 32-bit status code.
- Tag 7: reserved for future use.

Note that the tag numbers in the host value representation are not identical to the SCValType enumeration values used in the SCVal union. For example SCV_OBJECT is 4 whereas the host object tag value is 3. The difference arises from the fact that the host value type has a 2-level tagging scheme -- a 1-bit level followed by a 3-bit level -- whereas SCValType has a single 32-bit level of tagging.
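The bit layout described in this section can be sketched in a few lines. The type and function names below (RawVal, from_positive_i64, from_tag_body and so on) are hypothetical, and the placement of the object type code in the low 28 bits of the body is our reading of "from least-significant to most-significant bits" rather than something the text states field by field.

```rust
/// Hypothetical wrapper around the 64-bit host value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RawVal(pub u64);

pub const TAG_U32: u64 = 0;
pub const TAG_I32: u64 = 1;
pub const TAG_STATIC: u64 = 2;
pub const TAG_OBJECT: u64 = 3;

impl RawVal {
    /// Primary case 0: low bit 0, upper 63 bits hold a non-negative i64.
    pub fn from_positive_i64(v: i64) -> Option<RawVal> {
        if v < 0 { None } else { Some(RawVal((v as u64) << 1)) }
    }

    /// Primary case 1: low bit 1, then a 3-bit tag, then a 60-bit body.
    pub fn from_tag_body(tag: u64, body: u64) -> Option<RawVal> {
        if tag > 7 || (body >> 60) != 0 {
            return None;
        }
        Some(RawVal((body << 4) | (tag << 1) | 1))
    }

    pub fn from_u32(v: u32) -> RawVal {
        RawVal(((v as u64) << 4) | (TAG_U32 << 1) | 1)
    }

    /// The i32 is reinterpreted as u32 first so that sign extension does not
    /// spill into the 28 zero bits above it.
    pub fn from_i32(v: i32) -> RawVal {
        RawVal((((v as u32) as u64) << 4) | (TAG_I32 << 1) | 1)
    }

    /// Object reference: 28-bit type code in the low bits, 32-bit handle above.
    pub fn from_object(type_code: u32, handle: u32) -> Option<RawVal> {
        if (type_code >> 28) != 0 {
            return None;
        }
        Self::from_tag_body(TAG_OBJECT, (type_code as u64) | ((handle as u64) << 28))
    }

    pub fn as_positive_i64(self) -> Option<i64> {
        if self.0 & 1 == 0 { Some((self.0 >> 1) as i64) } else { None }
    }

    pub fn tag(self) -> Option<u64> {
        if self.0 & 1 == 1 { Some((self.0 >> 1) & 0b111) } else { None }
    }

    pub fn body(self) -> Option<u64> {
        if self.0 & 1 == 1 { Some(self.0 >> 4) } else { None }
    }

    /// Split an object reference back into (type code, handle).
    pub fn as_object(self) -> Option<(u32, u32)> {
        if self.tag()? != TAG_OBJECT {
            return None;
        }
        let body = self.body()?;
        Some(((body & 0x0FFF_FFFF) as u32, (body >> 28) as u32))
    }
}
```

Under this layout the static value true (body 1, tag 2) is the integer 21, and the positive integer 5 is the integer 10.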

Host object type(s)

There are many different host object types, and we refer to the disjoint union of all possible host object types as the host object type. This may be implemented in terms of a variant type, an object hierarchy, or any other similar mechanism in the host.

Every host object is held in host memory and cannot be accessed directly from guest code. Host objects can be referred to by host values in either host or guest code: specifically those values with tag 3 (object reference) refer to a host object by type code and handle.

Host object handles are assigned sequentially from 1, as host objects are allocated during the lifecycle of a host execution context. Host object handle 0 is reserved as a sentinel value that always denotes an invalid object, on which no host functions are defined. All host object types share a single numerical range of handles. In other words: the type codes held in object references reflect type differences between host objects, to allow guests to switch on host object types without calling host functions to query them, but the object type codes do not subdivide the numeric range of object handles.

There are 2^28 (268,435,456) possible host object type codes, of which only the first 6 are defined in this CAP:

Object type 0: a box which contains a single host value.
Object type 1: a vector which contains a sequence of host values.
Object type 2: a map which is an ordered association from host values to host values.
Object type 3: an unsigned 64-bit integer.
Object type 4: a signed 64-bit integer.
Object type 5: a binary object containing unspecified bytes.

Note that unlike value tags, the host object type codes are the same numbers as the SCObjectType codes in the XDR form. That is, SCO_VEC has value 1 which is the same as the host object type code for vector. Maintaining common numbering limits SCObjectType to 2^28 possible values as well.
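The handle and type-code rules above can be sketched with a simple table. The names HostObject, ObjectTable, add, get and object_ref are hypothetical, and contained host values are simplified to raw u64 for brevity; the point is only that handles count up from 1 across all object types, that handle 0 never resolves, and that the type code travels alongside the handle without partitioning the handle range.

```rust
/// The 6 object types defined in this CAP; contained host values are
/// simplified to raw u64 in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum HostObject {
    Boxed(u64),
    Vector(Vec<u64>),
    Map(Vec<(u64, u64)>),
    U64(u64),
    I64(i64),
    Binary(Vec<u8>),
}

impl HostObject {
    /// Type codes match the object type numbers listed above.
    pub fn type_code(&self) -> u32 {
        match self {
            HostObject::Boxed(_) => 0,
            HostObject::Vector(_) => 1,
            HostObject::Map(_) => 2,
            HostObject::U64(_) => 3,
            HostObject::I64(_) => 4,
            HostObject::Binary(_) => 5,
        }
    }
}

/// Hypothetical host object array: slot i holds the object with handle i + 1.
pub struct ObjectTable {
    objects: Vec<HostObject>,
}

impl ObjectTable {
    pub fn new() -> Self {
        ObjectTable { objects: Vec::new() }
    }

    /// Allocate the next handle; all object types share one numeric range.
    pub fn add(&mut self, obj: HostObject) -> u32 {
        self.objects.push(obj);
        self.objects.len() as u32
    }

    /// Handle 0 is the invalid sentinel and never resolves to an object.
    pub fn get(&self, handle: u32) -> Option<&HostObject> {
        if handle == 0 {
            return None;
        }
        self.objects.get(handle as usize - 1)
    }

    /// The (type code, handle) pair that an object reference value would carry.
    pub fn object_ref(&self, handle: u32) -> Option<(u32, u32)> {
        self.get(handle).map(|o| (o.type_code(), handle))
    }
}
```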

This CAP defines a basic comparison operation for these types, as well as validity and conversion operations for the XDR, but no other operations. An expanded repertoire of host object types and functions that operate on them will be presented in a later CAP.

Comparison

Values and objects in the data model have a total order. When comparing two values A and B:

If A is a positive int64 and B is not, A is less than B.
If A and B are both positive int64 values, they are ordered by the normal int64 order.
If A and B are both tagged and if A has a lesser tag than B, A is less than B.
If A and B are both equally tagged, then:
- If they have tag 0, they are ordered by the normal uint32 order on their low 32 bits.
- If they have tag 1, they are ordered by the normal int32 order on their low 32 bits.
- If they have tag 2, 5, 6 or 7 they are ordered by the normal uint64 order on the zero-extension of their low 60 bits.
- If they have tag 4 they are ordered by the lexicographical order of their Unicode string value.
- If they have tag 3 they are ordered by calling obj_cmp(A, B) which performs deep object comparison.

Deep object comparison can be accessed by either guest or host: it is provided to guests as a host function via the host environment interface. It performs a recursive structural comparison of objects and values embedded in objects using the following rules:

If A and B have different object types, they are ordered by object type code.
If A and B are boxes, their values are ordered by the value rules above.
If A and B are vectors, they are ordered by lexicographic extension of the value order.
If A and B are maps, they are ordered lexicographically as ordered vectors of (key, value) pairs.
If A and B are int64 or uint64, they are ordered using the normal order for those types.
If A and B are binary, they are ordered using the lexicographical order of their respective bytes.
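The two sets of rules above can be sketched over an unpacked model of the data. Val, Obj, val_cmp and seq_cmp are hypothetical names (only obj_cmp appears in this CAP), objects are inlined into values instead of being reached through handles, the reserved tag 7 is omitted, and maps are assumed to already be stored as vectors of pairs in key order.

```rust
use std::cmp::Ordering;

/// Unpacked model of a host value; variant order follows the tag order.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Val {
    PosI64(i64),      // primary case 0
    U32(u32),         // tag 0
    I32(i32),         // tag 1
    Static(u64),      // tag 2
    Object(Box<Obj>), // tag 3 (inlined here instead of a handle)
    Symbol(String),   // tag 4
    BitSet(u64),      // tag 5
    Status(u64),      // tag 6
}

/// Unpacked model of a host object; variant order follows the type codes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Obj {
    Boxed(Val),
    Vector(Vec<Val>),
    Map(Vec<(Val, Val)>),
    U64(u64),
    I64(i64),
    Binary(Vec<u8>),
}

fn val_rank(v: &Val) -> u8 {
    match v {
        Val::PosI64(_) => 0,
        Val::U32(_) => 1,
        Val::I32(_) => 2,
        Val::Static(_) => 3,
        Val::Object(_) => 4,
        Val::Symbol(_) => 5,
        Val::BitSet(_) => 6,
        Val::Status(_) => 7,
    }
}

fn obj_rank(o: &Obj) -> u8 {
    match o {
        Obj::Boxed(_) => 0,
        Obj::Vector(_) => 1,
        Obj::Map(_) => 2,
        Obj::U64(_) => 3,
        Obj::I64(_) => 4,
        Obj::Binary(_) => 5,
    }
}

/// Lexicographic extension of an element order to sequences.
fn seq_cmp<T>(a: &[T], b: &[T], f: impl Fn(&T, &T) -> Ordering) -> Ordering {
    for (x, y) in a.iter().zip(b.iter()) {
        let o = f(x, y);
        if o != Ordering::Equal {
            return o;
        }
    }
    a.len().cmp(&b.len())
}

/// Value order: positive int64 first, then by tag, then within the tag.
pub fn val_cmp(a: &Val, b: &Val) -> Ordering {
    match (a, b) {
        (Val::PosI64(x), Val::PosI64(y)) => x.cmp(y),
        (Val::U32(x), Val::U32(y)) => x.cmp(y),
        (Val::I32(x), Val::I32(y)) => x.cmp(y),
        (Val::Static(x), Val::Static(y)) => x.cmp(y),
        (Val::BitSet(x), Val::BitSet(y)) => x.cmp(y),
        (Val::Status(x), Val::Status(y)) => x.cmp(y),
        (Val::Symbol(x), Val::Symbol(y)) => x.cmp(y),
        (Val::Object(x), Val::Object(y)) => obj_cmp(x, y),
        _ => val_rank(a).cmp(&val_rank(b)),
    }
}

/// Deep object comparison: by type code first, then structurally.
pub fn obj_cmp(a: &Obj, b: &Obj) -> Ordering {
    match (a, b) {
        (Obj::Boxed(x), Obj::Boxed(y)) => val_cmp(x, y),
        (Obj::Vector(x), Obj::Vector(y)) => seq_cmp(x, y, val_cmp),
        (Obj::Map(x), Obj::Map(y)) => seq_cmp(x, y, |p, q| {
            val_cmp(&p.0, &q.0).then_with(|| val_cmp(&p.1, &q.1))
        }),
        (Obj::U64(x), Obj::U64(y)) => x.cmp(y),
        (Obj::I64(x), Obj::I64(y)) => x.cmp(y),
        (Obj::Binary(x), Obj::Binary(y)) => x.cmp(y),
        _ => obj_rank(a).cmp(&obj_rank(b)),
    }
}
```

Because symbols are restricted to ASCII characters, Rust's byte-wise string comparison coincides with the Unicode order required for tag 4.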

Validity

The following additional validity constraints are imposed on the XDR types. Values not conforming to these constraints are rejected during conversion to host form:

SCVal.pos_i64 must be >= 0.
SCVal.sym must consist only of the characters [_0-9A-Za-z].
SCVal.obj must not be empty (it is optional in the XDR only to enable type-recursion).
SCVal.bits must have its most significant 4 bits set to 0.
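A sketch of these checks, together with the symbol packing they protect, is given below. The function names are hypothetical. Two points are assumptions rather than statements of this CAP: the validity rule for SCVal.sym does not mention length, so the 10-character limit is taken from the tag 4 description; and only the code for _ (1) and the zero filler are given explicitly, so numbering the remaining characters in repertoire order (0-9, then A-Z, then a-z) is our guess.

```rust
/// Map a character to its 6-bit code, or None if it is outside [_0-9A-Za-z].
/// Only `_` = 1 is stated in the text; the rest follows repertoire order
/// as an assumption of this sketch.
pub fn symbol_code(c: char) -> Option<u64> {
    match c {
        '_' => Some(1),
        '0'..='9' => Some(2 + (c as u8 - b'0') as u64),
        'A'..='Z' => Some(12 + (c as u8 - b'A') as u64),
        'a'..='z' => Some(38 + (c as u8 - b'a') as u64),
        _ => None,
    }
}

/// Pack up to 10 characters into a 60-bit body, first character in the
/// least significant 6 bits; unused trailing positions stay zero.
pub fn encode_symbol_body(s: &str) -> Option<u64> {
    let mut body = 0u64;
    let mut n: u32 = 0;
    for c in s.chars() {
        if n == 10 {
            return None; // more than 10 characters cannot fit in 60 bits
        }
        body |= symbol_code(c)? << (6 * n);
        n += 1;
    }
    Some(body)
}

/// SCVal.pos_i64 must be >= 0.
pub fn valid_pos_i64(v: i64) -> bool {
    v >= 0
}

/// SCVal.bits must have its most significant 4 bits set to 0.
pub fn valid_bits(bits: u64) -> bool {
    (bits >> 60) == 0
}
```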

Conversion

Conversion from an XDR SCVal to a host value is as follows:

Type cases other than SCV_OBJECT are directly encoded into their bit-packed host value form.
For the SCV_OBJECT case, the contained SCObject is converted into a host object and placed in the host environment's host object array at the next available position P. The resulting host value is object handle P.

Conversion from a host value to an XDR SCVal is as follows:

For the bit-packed primary case 0, and for cases other than tag 3 (object) in primary case 1, each bit-packed representation is copied directly to its corresponding SCVal case.
For tag 3 in primary case 1, the object handle value is accessed in the host environment's host object array. If the object handle has a value beyond the end of the host object array, the conversion fails with an error. Otherwise the result of conversion is an SCVal in SCV_OBJECT state, with the conversion of the located host object assigned to SCVal.obj.

Conversion from an XDR SCObject to a host object is as follows:

Type case SCO_BOX forms a host box containing the conversion of the contained SCVal.
Type case SCO_VEC forms a host vector containing the conversion of each contained SCVal in order.
Type case SCO_MAP forms an ordered host map and then for each pair SCMapEntry, adds an entry mapping the conversion of the map entry's key to the conversion of the map entry's val, returning the resulting host map once all SCMapEntrys are added. Note that this means that the resulting map will be in comparison order rather than the order SCMapEntrys were provided, and any redundant entries for the same key earlier in the array of SCMapEntrys will be overwritten by later entries for the same key.
Type cases SCO_U64, SCO_I64 and SCO_BINARY simply move their contained value into a host object with the same content, unaltered.

Conversion from a host object to an XDR SCObject is as follows:

A host box object forms an SCObject of type SCO_BOX with the conversion of its value in SCObject.box.
A host vector object forms an SCObject of type SCO_VEC with the conversions of its element values in SCObject.vec.
A host map object forms an SCObject of type SCO_MAP with each mapping entry converted to an SCMapEntry and added to the resulting SCObject.map field in host value comparison order, from low to high.
A signed or unsigned int64 object, or binary object, is simply moved to its respective SCObject case.

Note that due to the re-ordering and de-duplication that occurs when converting an SCO_MAP SCObject, it is not the case that "round trip" conversions from XDR to host form and back to XDR always produce identical results.
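The effect on maps can be seen in a small sketch. MapEntry, xdr_to_host and host_to_xdr are hypothetical names, keys and values are simplified to u32, and the standard library's BTreeMap merely stands in for the host's ordered map.

```rust
use std::collections::BTreeMap;

/// Stand-in for SCMapEntry, with key and val simplified to u32.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MapEntry {
    pub key: u32,
    pub val: u32,
}

/// XDR to host: entries are inserted in the order given, so a later entry
/// for the same key overwrites an earlier one.
pub fn xdr_to_host(entries: &[MapEntry]) -> BTreeMap<u32, u32> {
    let mut map = BTreeMap::new();
    for e in entries {
        map.insert(e.key, e.val);
    }
    map
}

/// Host to XDR: entries are emitted in key comparison order, low to high.
pub fn host_to_xdr(map: &BTreeMap<u32, u32>) -> Vec<MapEntry> {
    map.iter()
        .map(|(&key, &val)| MapEntry { key, val })
        .collect()
}
```

Converting the entries (3, 30), (1, 10), (3, 99) to host form and back yields (1, 10), (3, 99): the duplicate key keeps its last value and the output is sorted, so the result differs from the input.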

Design Rationale

Rationale for WASM

WebAssembly was chosen as a basis for this CAP after extensive evaluation of alternative virtual machines. See "choosing wasm" for details, or the underlying stack selection criteria document.

Relative to requirements listed in this CAP, WASM addresses many of them:

Secure:
- Resource limits: WASM has good (though not ideal) mechanisms for enforcing resource limits.
- VM escape and side channels: WASM is designed as a secure sandbox and has a good security track record so far.
Well-defined:
- WASM has a rigorous formal semantics and conformance testsuite; it is well specified.
- WASM's nondeterminism is narrowly circumscribed and this CAP excludes all cases.
Performant:
- Code size: WASM code is compact but low level, risks being large. The host-centric data model in this CAP minimizes code size.
- Optimization: stock compilers emit efficient WASM code.
Interoperable:
- Multi-language: many PLs have at least preliminary WASM target support, though only a few are mature enough to use.
- Tool maturity: languages targeting WASM -- especially Rust -- have high quality, mature tools.
Simple:
- Non-novelty: WASM is a complete, mature, well-supported spec with many off-the-shelf implementations to choose from.
- Compatibility: many WASM interpreters are written in C++ and/or Rust, can be embedded easily in stellar-core.
- Learnability: WASM is not as familiar as EVM but is relatively widely known and appears easy to learn.

Rationale for value / object split

The split between values (which can traverse the host/guest interface) and objects (which remain on the host side and are managed by host functions) is justified as a response to a number of observations we made when considering existing blockchains:

Many systems spend a lot of guest code footprint (time and space) implementing data serialization and deserialization to and from opaque byte arrays. This code suffers from a variety of problems:
- It is often to and from an opaque format, making a contract's data difficult to browse or debug, and making SDKs that invoke contracts need to carry special code to serialize and deserialize data for the contract.
- It is often coupled to a specific version or layout of a data structure, such that data cannot easily be migrated between versions of a contract.
- It requires that a contract potentially contains extra copies of serialization support code for the formats used by any contracts it calls.
- It is often intermixed with argument processing and contract logic, representing a significant class of security problems in contracts.
- It is usually unshared code: each contract implements its own copy of serialization and deserialization, and does so inefficiently in the guest rather than efficiently on the host.
Similarly, when guest code is CPU-intensive it is often performing numerical or cryptographic operations which would be better supported by a common library of efficient (native) host functions.
As of this writing, WASM defines no mechanism of directly sharing code, which makes it impossible to reuse common guest functions needed by many contracts. Sharing common host functions is comparatively straightforward, and much more so if we define a common data model on which host functions operate.
The more time is spent in the guest, the more the overall system performance depends directly on the speed of the guest VM's bytecode-dispatch mechanism (a.k.a. the VM's "inner loop"). By contrast, if the guest VM spends most of its time making a sequence of host calls, the bytecode-dispatch speed of the guest VM is less of a concern. This gives us much more flexibility in choice of VM, for example to choose simple, low-latency and comparatively-secure interpreters rather than complex, high-latency and fragile JITs.

Some systems mitigate these issues by providing byte-buffers of data to guests in a guaranteed input format, such as JSON. This eliminates some of the interoperability concerns but none of the efficiency concerns: the guest still spends too much time parsing input and building data structures.

Ultimately we settled on an approach in which the system will spend as little time in the guest as possible, and will furnish the guest with a rich enough repertoire of host objects that it should not need many or any of its own guest-local data structures. We expect that many guests will be able to run without a guest memory allocator at all.

There are various costs and benefits to this strategy. We compared in detail to many other blockchains with different approaches before settling on this one.

Costs:

Larger host-object API attack surface to defend.
Larger host-object API compatibility surface to maintain.
More challenging task to quantify memory and CPU costs.
More specification work to do defining host interface.
Risks redundant work, guest may choose to ignore host objects.

Benefits:

Much faster execution due to most logic being in C++.
Smaller guest input-parsing attack surfaces to defend.
Smaller guest data compatibility surfaces to maintain.
Much smaller guest code, minimizing storage and instantiation costs:
- Little or no code to serialize or deserialize data in guest.
- Little or no common memory-management or data structure code in guest.
Auxiliary benefits from common data model:
- Easier to browse contract data by 3rd party tools.
- Easier to debug contracts by inspecting state.
- Easier to test contracts by generating / capturing data.
- Easier to pass data from one contract to another.
- Easier to use same data model from different source languages.

It is especially important to note that the (enlarged) attack and maintenance surfaces on the host are costs borne by stellar-core developers, while the (diminished) attack and maintenance surfaces are benefits that accrue to smart contract developers. We believe this is a desirable balance of costs and benefits.

Rationale for value and object type repertoires

These are chosen based on two criteria:

Reasonably-foreseeable use in a large number of smart contracts.
Widely-available implementations with efficient immutable forms.

In addition, values are constrained by the ability to be packed into a 64-bit tagged disjoint union. Special cases for common small values such as symbols, booleans, 32-bit integers, status codes and small bitsets are provided on the basis of presumed utility in a variety of contexts.

The value tagging scheme is arranged into two levels -- a primary single-bit tag followed by a secondary 3-bit tag in one of the two primary cases -- in order to facilitate storing positive 64-bit integers in one of the primary cases, without overflowing to an object. We observe that the majority of 64-bit values in the current ledger are positive, representing (for example) asset amounts, time points and sequence numbers.

Implementations of the map and vector object types are based on design techniques from the functional language community, specifically Relaxed-Radix-Balanced vectors (RRBs) and Hash Array Mapped Tries (HAMTs). Both of these data types support efficient "modifying copies" that produce new data structures from updates applied to old ones, while sharing most of the memory and substructure of the old object with the new one.

Rationale for separate XDR and host forms

It would be possible to store all data in memory in the host in its XDR format, but we choose instead to define a separate "host form" for both values and objects in this specification for the following reasons:

In the host form, values are bit-packed in order to fit in exactly 64 bits. This bit-packing is implemented in stellar-core but is somewhat delicate and would be undesirable to reimplement in every client SDK and data browser. In the XDR form, the various cases that make up the value union are represented in a standard XDR union, which is automatically supported by many languages' XDR bindings.
In the host form, objects and values are separated for reasons explained above, and their separation is mediated through object references and the host environment that maps references to objects. In the XDR form, objects and values are not separated, because they should not be: there is no implicit context in which to resolve references, and even if there were it would introduce a new category of potential reference-mismatch error in the serialized form to support it. Instead, in the XDR form values directly contain objects.
In the host form, maps and vectors are implemented using memory-efficient substructure-sharing datatypes as described above. Additionally, maps support CPU-efficient hashed lookup by key. In the XDR form, maps are simple linear arrays of key-value pairs, and neither vectors nor maps support any sort of partial substructure-sharing updates.
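The distinction between the two forms can be sketched in a few lines. Everything here is a simplified assumption for illustration: the type names (`XdrVal`, `HostVal`, `HostObj`, `Host`), the restriction of map keys to `u32`, and the use of a standard `HashMap` in place of a substructure-sharing map are not taken from the specification.

```rust
use std::collections::HashMap;

/// XDR-like form: values contain objects directly, maps are linear pair arrays.
#[derive(Clone, Debug, PartialEq)]
pub enum XdrVal {
    U32(u32),
    Vec(Vec<XdrVal>),
    Map(Vec<(u32, XdrVal)>),
}

/// Host form: a value is either small and inline, or a reference to an object.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HostVal {
    U32(u32),
    ObjRef(usize),
}

pub enum HostObj {
    Vec(Vec<HostVal>),
    Map(HashMap<u32, HostVal>), // hashed lookup by key
}

/// The host environment maps references to objects.
#[derive(Default)]
pub struct Host {
    objects: Vec<HostObj>,
}

impl Host {
    fn add(&mut self, o: HostObj) -> HostVal {
        self.objects.push(o);
        HostVal::ObjRef(self.objects.len() - 1)
    }

    /// Convert XDR form to host form, allocating one host object per container.
    pub fn import_xdr(&mut self, x: &XdrVal) -> HostVal {
        match x {
            XdrVal::U32(n) => HostVal::U32(*n),
            XdrVal::Vec(items) => {
                let vals: Vec<HostVal> = items.iter().map(|i| self.import_xdr(i)).collect();
                self.add(HostObj::Vec(vals))
            }
            XdrVal::Map(pairs) => {
                let m: HashMap<u32, HostVal> =
                    pairs.iter().map(|(k, v)| (*k, self.import_xdr(v))).collect();
                self.add(HostObj::Map(m))
            }
        }
    }

    /// Convert host form back to XDR form by resolving every reference.
    /// Returns None if a reference does not resolve in this host.
    pub fn export_xdr(&self, v: HostVal) -> Option<XdrVal> {
        match v {
            HostVal::U32(n) => Some(XdrVal::U32(n)),
            HostVal::ObjRef(r) => match self.objects.get(r)? {
                HostObj::Vec(items) => {
                    let out: Option<Vec<XdrVal>> =
                        items.iter().map(|i| self.export_xdr(*i)).collect();
                    out.map(XdrVal::Vec)
                }
                HostObj::Map(m) => {
                    let mut pairs = Vec::new();
                    for (k, v) in m {
                        pairs.push((*k, self.export_xdr(*v)?));
                    }
                    pairs.sort_by_key(|(k, _)| *k); // fixed order for serialization
                    Some(XdrVal::Map(pairs))
                }
            },
        }
    }
}
```

Note that `export_xdr` can fail on a dangling reference, whereas an `XdrVal` cannot express one at all. This is the reference-mismatch category of error that the XDR form avoids by having values contain objects directly.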

Rationale for immutable objects

We considered the potential costs and benefits of immutable objects, and decided in favor of them.

Costs:

More memory allocation.
Risk of referring to an old/stale object rather than a fresh/new one.

Benefits:

Reduced risk of error through mutating a shared object.
Simple model of equality, for using structured values as map keys.
Simple model of security: no covert channels, only passed values.
Simple model for transactions: discard objects on rollback.

Since we expect smart contracts to run to completion very quickly, and then free all objects allocated, we do not consider the additional memory allocation cost a likely problem in practice. Furthermore, as mentioned in the object-repertoire rationale above, we have been using shared-substructure types in our prototype, so most large-object updates should only consume minimal new memory.

Therefore the only real risk we foresee is the increased risk of unintentionally referring to an old/stale object, and we believe this is outweighed by the reduced risk of unintentionally referring to a shared mutable object that is mutated through an alias.
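The trade-off above, and the "discard objects on rollback" benefit, can be made concrete with a small sketch. It assumes an append-only object table in which every update allocates a new object; `ObjectTable`, `Handle`, `checkpoint` and `rollback` are hypothetical names for this example, not the host's actual interface.

```rust
/// A handle to an immutable object in the table (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle(usize);

/// Append-only table of immutable objects; here each object is a vector of u32.
#[derive(Default)]
pub struct ObjectTable {
    objects: Vec<Vec<u32>>,
}

impl ObjectTable {
    pub fn alloc(&mut self, data: Vec<u32>) -> Handle {
        self.objects.push(data);
        Handle(self.objects.len() - 1)
    }

    pub fn get(&self, h: Handle) -> Option<&[u32]> {
        self.objects.get(h.0).map(|v| v.as_slice())
    }

    /// "Update" an object by allocating a new one with `item` appended.
    /// The old handle keeps referring to the old, unchanged contents.
    pub fn push(&mut self, h: Handle, item: u32) -> Option<Handle> {
        let mut data = self.get(h)?.to_vec();
        data.push(item);
        Some(self.alloc(data))
    }

    /// Record the current table size so later allocations can be discarded.
    pub fn checkpoint(&self) -> usize {
        self.objects.len()
    }

    /// Roll back by discarding every object allocated after the checkpoint.
    /// No existing object needs repair, because none was ever mutated.
    pub fn rollback(&mut self, checkpoint: usize) {
        self.objects.truncate(checkpoint);
    }
}
```

In this sketch the stale-object risk is visible: after `let b = table.push(a, 3)`, code that keeps using `a` silently sees the old contents. The aliasing risk is absent: no holder of `a` can observe a change made by someone else.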

Protocol Upgrade Transition

This CAP does not introduce any protocol changes.

Backwards Incompatibilities

This CAP does not introduce any backward incompatibilities.

Resource Utilization

TBD. Performance evaluation of the in-progress implementation is ongoing.

Security Concerns

In order to describe the security implications of this CAP we use the STRIDE methodology. This is a common framework used in the industry to identify security threats. For each category we use attack scenarios to better explain the threat.

Spoofing: Attackers are able to make the system believe they are privileged users
- A logical vulnerability exists in the WASM code of the smart contract and lets a standard user perform privileged tasks
- A logical vulnerability exists in a host function and leads to a failure in access control checks
Tampering: Attackers are able to modify unauthorized data in the ledger database
- A write-anywhere vulnerability exists in the WASM interpreter. A specially crafted WASM code triggers this bug and lets a user write custom data in the host memory, which then gets reflected in the database
- A write-anywhere vulnerability exists in a host function. A smart-contract code calls the vulnerable host function and triggers the vulnerability. A user calls the smart-contract and uses it to write custom data in the host memory or directly in the database
- A logical vulnerability exists in the implementation of the serialization and deserialization of the data model. A smart-contract code instantiates specific objects on the host side and triggers the vulnerable part of the serializer to tamper with the data saved in the database
Repudiation: Not applicable here
Information disclosure: Attackers are able to access unauthorized information on the validators (secret seed for example), on the ledger database (other smart contract data) or guest memory data from another contract:
- A read-anywhere vulnerability exists in the WASM interpreter. A specially crafted WASM code triggers this vulnerability and lets a user read custom data in the host memory
- A read-anywhere vulnerability exists in a host function. A smart-contract code calls the vulnerable host function and triggers the vulnerability. A user calls the smart-contract and uses it to read custom data in the host memory
- During a smart contract execution a function from another smart contract is called. This call exploits a read-anywhere vulnerability in the access control checks of new contract data. This results in the caller contract being able to programmatically access the data of the callee contract. This is an issue for contracts like Oracles.
Denial of Service: Network halts because consensus cannot be reached
- A logical vulnerability exists in the implementation which validates that only deterministic WASM code is executed. A specially crafted WASM code triggers this vulnerability and creates nondeterminism across the network
- A logical vulnerability exists in the implementation which computes the amount of gas needed to execute a smart-contract code. A smart-contract code exploits this vulnerability and requires too many computing resources for the validators, preventing them from closing the ledger in an acceptable time frame
Elevation of privilege: Attackers are able to execute unauthorized code on the validators
- A code execution vulnerability exists in the WASM interpreter. A specially crafted WASM code triggers this vulnerability and lets a user execute code within the host context (stellar-core process)
- A code execution vulnerability exists in a host function. A smart-contract code calls the vulnerable host function and triggers the vulnerability. A user calls the smart-contract and uses it to execute code within the host context (stellar-core process)

Test Cases

TBD. See in-progress implementation.

Implementation

An implementation is provided in two parts:

The rs-stellar-contract-env repository, which contains three Rust crates:
- stellar-contract-env-host: a Rust implementation of the host environment
- stellar-contract-env-guest: a Rust interface for Rust guest code to interact with the host environment
- stellar-contract-env-common: a set of definitions common to both
PR 3428 on the stellar-core repository, which provides the XDR definitions above and a connection between stellar-core and the rs-stellar-contract-host crate.
