CAP: 0052
Title: Smart Contract Host Functionality: Base64 Encoding/Decoding
Working Group:
Owner: Leigh McCulloch <@leighmcculloch>
Authors: Leigh McCulloch <@leighmcculloch>
Consulted:
Status: Rejected
Created: 2023-02-15
Discussion: TBD
Protocol version: TBD
Support base64 encoding/decoding in Soroban contracts via the exported host interface.
Base64 encoding/decoding is a common encoding of binary data, especially within data formatted as JSON. Encoding and decoding base64 is something that can be done in contract code under the limits known today, relatively efficiently.
For example, a single small encode of 32 bytes to 43 bytes is about 200,000 instructions and about 460 bytes of additional code.
Even though base64 encoding can be done relatively easily for the case of small data, in applications that need to encode or decode larger amounts of data it quickly become a larger percentage of an applications work.
Additionally, even if efficient, this is a common standard that is simple to encode host side and available in many standard libraries today.
An expected use of the base64 encode/decode is in the implementation of a webauthn contract, where base64 url encoding is used.
This CAP is aligned with the following Stellar Network Goals:
- The Stellar Network should make it easy for developers of Stellar projects to create highly usable products
This proposal adds two functions to the Soroban environment's exported interface that accepts the parameters for configuring a variant of base64, and encode and decode bytes to and from base64 as specified in RFC4648.
A new function base64_encode
with export name n
in module b
is added to
the Soroban environment's exported interface. It accepts an alphabet
,
padding
, and returns the bytes encoded in base64.
Another new function base64_decode
with export name o
in module b
is added
to the Soroban environment's exported interface. It accepts an alphabet
,
padding
, and returns the bytes decoded. If the decode fails for any reason it
returns an ScError
.
For both functions the parameters behave the same.
Parameter alphabet
specifies the alphabet used by the encoder and decoder. The
symbols std
or url
are accepted, where they map respectively to the standard
alphabet and the URL and filename friendly alphabet as specified in RFC4648.
Parameter padding
specifies the padding character with the same limitations on
valid values as specified in RFC4648. The padding character is 8-bit, and is
stored in the lower 8-bits of the U32Val type. If the padding value of
0xffffffff is used it indicates no padding is to be used when encoding, and no
padding is required when decoding, but padding is permitted, unless strict is
specified.
Parameter strict
specifies for decoding whether incomplete, non-existent, or
present-when-unspected padding should be considered an error. If strict is
specified then any padding that doesn't exactly match the padding that's
required would cause an error.
--- a/soroban-env-common/env.json
+++ b/soroban-env-common/env.json
@@ -1890,6 +1890,38 @@
],
"return": "U32Val",
"docs": "Return the index of a Symbol in an array of linear-memory byte-slices, or trap if not found."
+ },
+ {
+ "export": "n",
+ "name": "base64_encode",
+ "args": [
+ {
+ "name": "input",
+ "type": "BytesObject"
+ },
+ {
+ "name": "alphabet",
+ "type": "Symbol"
+ },
+ {
+ "name": "padding",
+ "type": "U32Val"
+ }
+ ],
+ "return": "BytesObject",
+ "docs": "Base64 encodes as specified in RFC4648 the input BytesObject, using the alphabet specified as a symbol 'std' or 'url' that respectively encode with the standard alphabet or the URL/filename-safe alphabet, padded with the u8 byte stored in the low 8-bits of the padding U32Val, or not padded if the padding is 0xffffffff, returning a BytesObject with the encoded value."
+ },
+ {
+ "export": "o",
+ "name": "base64_decode",
+ "args": [
+ {
+ "name": "input",
+ "type": "BytesObject"
+ },
+ {
+ "name": "alphabet",
+ "type": "Symbol"
+ },
+ {
+ "name": "padding",
+ "type": "U32Val"
+ },
+ {
+ "name": "strict",
+ "type": "Bool"
+ }
+ ],
+ "return": "BytesObject",
+ "docs": "Base64 decodes as specified in RFC4648 the input BytesObject, using the alphabet specified as a symbol 'std' or 'url' that respectively encode with the standard alphabet or the URL/filename-safe alphabet, expecting padding with the u8 byte stored in the low 8-bits of the padding U32Val, or not padded if the padding is 0xffffffff, returning a BytesObject with the encoded value. If strict is specified, decoding fails if trailing padding is not exactly the number of bits required to pad. If decoding fails, the function returns ScError."
}
]
},
The alphabet's included are commonly used. Standard base64 encoding is most commonly used in applications. Base64 url encoding is less commonly used, but is used in the webauthn standard.
No custom alphabets are supported because their use is rare, however support for
custom alphabets could be added in the future by allowing for a 64 byte String
or Bytes value to be passed as the alphabet instead of a Symbol. The use of
Symbol to signal std
or url
is compact and efficient for known alphabets and
makes those selections distinct from the provision of a custom String alphabet
in the future.
The base64 encode/decode interface is largely motivated by the webauthn use case. Webauthn involves an application and device holding a key and using that key for authentication and authorization. The messages that Webauthn signs are JSON and contain a base64 url encoded challenge value.
To implement Webauthn on Stellar in Soroban custom account contracts the challenge would be the hash of an authorization entry. Webauthn base64 url encodes it and the contract that does verification would either need to be able to decode the challenge value to compare with the value it expects, or be able to encode the value it expects to compare with the encoded value found in the JSON message.
For example, a client data JSON:
{
"type":"webauthn.get",
"challenge":"hJHFvaaoU7qkcH9kML46shLL_btpYGCA6ty3ie0M1Qw",
"origin":"http://localhost:4507",
"crossOrigin":false
}
This proposal is completely backwards compatible.
The new functions must be appropriately metered.
Some applications that use base64 require that their base64 decoders are strict in regards to the presence or non-presence of padding. Applications that fall into this category may be applications that need to ensure that data encoded always has a single representation in the encoded form. For this reason it is relatively important that the variants supported include the ability to require that decoding fail if indicated by the application that strict decoding is required.
None yet.
None yet.