Skip to content

Latest commit

 

History

History
1221 lines (997 loc) · 38.4 KB

HowtoBuffers2x.md

File metadata and controls

1221 lines (997 loc) · 38.4 KB

How to work with buffers in Duktape 2.x

This page applies to Duktape 2.x only.

Introduction

Overview of buffer types

Buffer type Standard C API type ECMAScript type Description
Plain buffer No
Duktape specific
DUK_TAG_BUFFER [object Uint8Array] Plain memory-efficient buffer value (not an object). Mimics an Uint8Array for most ECMAScript behavior, separate type in C API. Object coerces to an actual Uint8Array instance. Has virtual index properties. (Behavior changed in Duktape 2.x.)
ArrayBuffer object Yes
ES2015
DUK_TAG_OBJECT [object ArrayBuffer] Standard object type for representing a byte array. Has additional non-standard virtual index properties.
DataView, typed array objects Yes
ES2015
DUK_TAG_OBJECT [object Uint8Array], etc Standard view objects to access an underlying ArrayBuffer.
Node.js Buffer object No
Node.js-like
DUK_TAG_OBJECT [object Uint8Array] Object with Node.js Buffer API. Currently tracked Node.js version is 6.7.0; Duktape tracks the latest with some delay.

ArrayBuffer and typed arrays recommended

New code should use ES2015 ArrayBuffers and typed arrays (such as Uint8Array) unless there's a good reason for doing otherwise. Here's one tutorial on getting started with typed arrays:

ArrayBuffer encapsulates a byte buffer. Typed array objects are views into an underlying ArrayBuffer, e.g. Uint32Array provides a virtual array which maps to successive 32-bit values in the underlying array. Typed arrays have host specific endianness and have alignment requirements with respect to the underlying buffer. DataView provides a set of accessor for reading and writing arbitrarily aligned elements (integers and floats) in an underlying ArrayBuffer; endianness can be specified explicitly so DataView is useful for e.g. file format manipulation.

Plain buffers for low memory environments

For very low memory environments plain buffers can be used in places where an Uint8Array would normally be used. Plain buffers mimic Uint8Array behavior quite closely for ECMAScript code so often only small ECMAScript code changes are needed when moving between actual Uint8Arrays and plain buffers. C code needs to be aware of the typing difference, however.

Plain buffers only provide an uint8 access into an underlying buffer. Plain buffers can be fixed, dynamic (resizable), or external (point to user controlled buffer outside of Duktape control). A plain buffer value object coerces to a Uint8Array object, similarly to how a plain string object coerces to String object.

When buffer object support is disabled in Duktape configuration, only plain buffers will be available. They will inherit from Uint8Array.prototype which is reachable via plain buffer values (e.g. buf.__proto__) but isn't registered to the global object. All the typed array methods will be absent; the intent is to mostly work with the buffers from C code.

Node.js Buffer bindings

Node.js Buffer bindings are useful when working with Node.js compatible code.

Node.js Buffer provides both a uint8 virtual array and a DataView-like set of element accessors, all in a single object. Since Node.js is not a stable specification like ES2015, Node.js Buffers are more of a moving target than typed arrays.

Buffer type mixing supported but not recommended

Because the internal data type for all buffer objects is the same, they can be mixed to some extent. For example, Node.js Buffer.concat() can be used to concatenate any buffer types. However, the mixing behavior is liable to change over time so you should avoid mixing unless there's a clear advantage in doing so.

Changes going forward

The most likely development direction for future releases is to:

  • Follow ES2015+ more and more closely for buffer semantics.

  • Make the standard types more memory and performance efficient.

  • Eliminate the C API distinction between plain buffers and typed array objects.

Useful references

API summary

Creating buffers

Type C ECMAScript Notes
plain buffer duk_push_buffer()
duk_push_fixed_buffer()
duk_push_dynamic_buffer()
duk_push_external_buffer()
Uint8Array.allocPlain()
Uint8Array.plainOf()
Uint8Array.plainOf() gets the underlying plain buffer from any buffer object without creating a copy. Slice offset/length information is lost.
ArrayBuffer object duk_push_buffer_object() new ArrayBuffer()  
DataView object duk_push_buffer_object() new DataView()  
Typed array objects duk_push_buffer_object() new Uint8Array()
new Int32Array()
new Float64Array()
etc
 
Node.js Buffer object duk_push_buffer_object() new Buffer()  

When a typed array is created an ArrayBuffer object is also created and available as the .buffer property of the typed array. Duktape 2.0 creates the ArrayBuffer when the typed array is created, but Duktape 2.1 creates the ArrayBuffer lazily on first read of the .buffer property.

Type checking buffers

Type C ECMAScript Notes
plain buffer duk_is_buffer()
duk_is_buffer_data()
n/a
ArrayBuffer object duk_is_buffer_data() v instanceof ArrayBuffer  
DataView object duk_is_buffer_data() v instanceof DataView  
Typed array objects duk_is_buffer_data() v instanceof Uint8Array, ...  
Node.js Buffer object duk_is_buffer_data() Buffer.isBuffer()  

Accessing buffer data

Type C ECMAScript Notes
plain buffer duk_get_buffer()
duk_require_buffer()
buf[0], buf[1], ...
buf.length
buf.byteLength
buf.byteOffset
buf.BYTES_PER_ELEMENT
Non-standard type. The .buffer property returns an ArrayBuffer spawned on-the-fly (new instance for every access).
ArrayBuffer object duk_get_buffer_data()
duk_require_buffer_data()
new Uint8Array(buf)[0], ...
buf.byteLength
No direct access to the underlying buffer. Access the buffer via a typed array view, e.g. Uint8Array.
DataView object duk_get_buffer_data()
duk_require_buffer_data()
view.getInt16()
view.setUint32()
...
view.byteLength
view.byteOffset
The .buffer property contains the ArrayBuffer the view operates on. The property is lazy; the ArrayBuffer is created on the first access and remains the same afterwards.
Typed array objects duk_get_buffer_data()
duk_require_buffer_data()
view[0], view[1], ...
view.length
view.byteLength
view.byteOffset
view.BYTES_PER_ELEMENT
The .buffer property contains the ArrayBuffer the view operates on. The property is lazy; the ArrayBuffer is created on the first access and remains the same afterwards.
Node.js Buffer object duk_get_buffer_data()
duk_require_buffer_data()
buf[0], buf[1], ...
buf.length
buf.byteLength
buf.byteOffset
buf.BYTES_PER_ELEMENT
In Node.js v6.7.0+ a Buffer is implemented as an Uint8Array with a custom prototype object.

Configuring buffers

Type C ECMAScript Notes
plain buffer duk_config_buffer()
duk_resize_buffer()
duk_steal_buffer()
n/a Fixed plain buffers cannot be configured. Dynamic plain buffers can be resized and their current allocation can be "stolen". External plain buffers can be reconfigured to map to a different memory area.
ArrayBuffer object n/a n/a After creation, ArrayBuffer objects cannot be modified. However, their underlying plain buffer can be reconfigured (depending on its type).
DataView object n/a n/a After creation, DataView objects cannot be modified. However, their underlying plain buffer can be reconfigured (depending on its type).
Typed array objects n/a n/a After creation, typed array objects cannot be modified. However, their underlying plain buffer can be reconfigured (depending on its type).
Node.js Buffer object n/a n/a After creation, Node.js Buffer objects cannot be modified. However, their underlying plain buffer can be reconfigured (depending on its type).

Buffer-to-string conversion

Call Description
duk_buffer_to_string() Buffer data is used 1:1 as internal string representation. If you want to create valid ECMAScript strings, data should be in CESU-8 encoding. It's possible to create symbol values (intentionally or by accident). Using duk_push_lstring() for the buffer data is equivalent.
new TextDecoder().decode(buf) Decodes a buffer as a UTF-8 string and outputs a valid ECMAScript string. Invalid byte sequences are replaced with U+FFFD, non-BMP characters are replaced with surrogate pairs.
duk_to_string() Not very useful: invokes ECMAScript ToString() coercion which results in strings like [object Uint8Array].
String(buf) Not very useful: invokes ECMAScript ToString() coercion as for duk_to_string().

String-to-buffer conversion

Call Description
duk_to_buffer() Bytes from the string internal representation are copied byte-for-byte into the result buffer. For valid ECMAScript strings the result is CESU-8 encoded.
new TextEncoder().encode(str) The string internal representation is decoded as extended CESU-8/UTF-8 and then encoded into UTF-8. Surrogate pairs are combined, and invalid byte sequences are replaced with U+FFFD.
new Buffer(str) The string is treated the same as with TextEncoder.
Uint8Array.allocPlain(str) String internal representation is copied byte-for-byte into the resulting buffer as for duk_to_buffer().

String/buffer conversion use cases

Conversion C ECMAScript Notes
Buffer-to-string UTF-8 n/a new TextDecoder().decode(buf) Buffer is interpreted as UTF-8, invalid UTF-8 sequences are replaced with U+FFFD, non-BMP codepoints are expanded into surrogate pairs.
Buffer-to-string CESU-8 n/a n/a Buffer is interpreted as CESU-8, no bindings for this now.
Buffer-to-string 1:1 duk_buffer_to_string() n/a Buffer is converted byte-for-byte into the internal (extended CESU-8/UTF-8) representation without decoding. This coercion can also result in a symbol value.
String-to-buffer UTF-8 n/a new TextEncoder().encode(str) String is converted from a 16-bit codepoint list to UTF-8. Valid surrogate pairs are combined, invalid surrogate pairs and invalid byte sequences are replaced with U+FFFD.
String-to-buffer CESU-8 n/a n/a No bindings for this now.
String-to-buffer 1:1 duk_to_buffer() n/a String is converted byte-for-byte from the internal representation into a buffer. For valid ECMAScript strings the result is valid CESU-8 which is used as their internal representation.

Plain buffers

A plain buffer value mimics an Uint8Array instance, and has virtual properties:

// Create a plain buffer of 8 bytes.
var plain = Uint8Array.allocPlain(8);  // Duktape custom call

// Fill it using index properties.
for (var i = 0; i < plain.length; i++) {
    plain[i] = 0x41 + i;
}

// Print other virtual properties.
print(plain.length);             // -> 8
print(plain.byteLength);         // -> 8
print(plain.byteOffset);         // -> 0
print(plain.BYTES_PER_ELEMENT);  // -> 1

// Because a plain buffer doesn't have an actual property table, new
// properties cannot be added (this behavior is similar to a plain string).
plain.dummy = 'foo';
print(plain.dummy);              // -> undefined

// Duktape JX format can be used for dumping
print(Duktape.enc('jx', plain)); // -> |4142434445464748|

// Plain buffers mimic Uint8Array behavior where applicable, e.g.
print(typeof plain);             // -> object, like Uint8Array
print(String(plain));            // -> [object Uint8Array], like Uint8Array

Uint8Array is the "object counterpart" of a plain buffer. It wraps a plain buffer, similarly to how a String object wraps a plain string. Uint8Array also has the same virtual properties, and since it has an actual property table, new properties can also be added normally.

You can easily convert between the two:

// Create an 8-byte plain buffer.
var plain1 = Uint8Array.allocPlain(8);

// Convert a plain buffer to a full Uint8Array, both pointing to the same
// underlying buffer.
var u8 = Object(plain1);

// Get the plain buffer wrapped inside a Uint8Array.
var plain2 = Uint8Array.plainOf(u8);  // Duktape custom call

// No copies are made of 'plain1' in this process.
print(plain1 === plain2);  // -> true

Plain buffers have an inherited .buffer property (a getter) which returns an ArrayBuffer backing to the same plain buffer. Because there's no property table, each .buffer read creates a new ArrayBuffer instance, so avoid reading the property over and over again. The .buffer property allows one to create another view over the plain buffer without making a copy:

var plain = Uint8Array.allocPlain(8);

// A typed array constructor interprets a plain array like a Uint8Array:
// it gets treated as an initializer array so that a copy is made.  Here,
// when constructing a Uint16Array, each input byte expands to 16 bits.
var u16 = new Uint16Array(plain);  // no shared storage

// Using .buffer allows a shared view to be created.  Here, a two-element
// Uint32Array is created over the 8-byte plain buffer.
var u32 = new Uint32Array(plain.buffer);  // shared storage

To summarize, the main differences between a plain buffer and an Uint8Array are:

  Plain buffer Uint8Array Notes
Creation Uint8Array.allocPlain(length)
Uint8Array.allocPlain('stringValue')
Uint8Array.allocPlain([ 1, 2, 3, 4 ])
new Uint8Array(length)
new Uint8Array([ 1, 2, 3, 4 ])
Uint8Array.allocPlain() has more argument variants, and strings are treated specially (string internal representation is copied 1:1 into the buffer). C API can of course also be used for buffer creation.
typeof object object  
Object.prototype.toString() [object Uint8Array] [object Uint8Array]  
instanceof Uint8Array true true  
Property table No Yes Plain buffer has no property table but inherited from Uint8Array.prototype. Property writes are usually ignored, but e.g. an inherited setter can capture a write.
.buffer property Yes Yes Plain buffer has an inherited .buffer getter which returns an ArrayBuffer which backs to the plain buffer. Each read creates a new ArrayBuffer instance.
Allow finalizer No Yes Even though plain buffers inherit from Uint8Array.prototype, a finalizer is not supported, even if the finalizer was inherited.
Object.isExtensible() false true  
buf.subarray() result Uint8Array Uint8Array The result of a .subarray() on a plain buffer is a Uint8Array object because a plain buffer cannot express a slice offset.

Other notes:

  • Plain buffers behave like Uint8Arrays when passed as an argument to built-in typed array bindings. In many cases the internal implementation will first promote the plain buffer to a (temporary) Uint8Array object which is used for the operation; the temporary is then thrown away. This affects performance when using plain buffers with some ECMAScript bindings.

  • Duktape built-ins like Duktape.dec() create plain buffers to save memory space; if you explicitly wish to work with Uint8Array objects you can e.g. use Object(Duktape.dec('hex', 'deadbeef')).

JSON and JX serialization

The Node.js Buffer type has a .toJSON() method so it gets serialized in standard JSON.stringify():

var buf = new Buffer('ABCD');
print(JSON.stringify(buf));

// Output:
// {"type":"Buffer","data":[65,66,67,68]}

ArrayBuffer doesn't have any enumerable own properties and no .toJSON() so they serialize as empty objects (same applies to DataView):

var buf = Duktape.dec('hex', 'deadbeef');
print(JSON.stringify([ 1, buf, 2 ]));

// Output:
// [1,{},2]

Plain buffers and typed arrays have enumerable index properties but no .toJSON() so they get serialized as objects (not arrays):

var plain = Uint8Array.allocPlain('foo');
var u16 = new Uint16Array([ 0x1111, 0x2222, 0x3333 ]);
print(JSON.stringify({ plain: plain, u16: u16 }));

// Output:
// {"plain":{"0":102,"1":111,"2":111},"u16":{"0":4369,"1":8738,"2":13107}}

You can of course add a .toJSON() yourself:

Uint8Array.prototype.toJSON = function (v) {
    var res = [];
    var nybbles = '0123456789abcdef';
    var u8 = this;
    for (var i = 0; i < u8.length; i++) {
        res[i] = nybbles[(u8[i] >> 4) & 0x0f] +
                 nybbles[u8[i] & 0x0f];
    }
    return res.join('');
};
var u8 = new Uint8Array([ 0x41, 0x42, 0x43, 0x44 ]);
print(JSON.stringify({ myBuffer: u8 }));

// Output:
// {"myBuffer":"41424344"}

Duktape JX format supports all buffer objects directly, encoding them like plain buffers unless a .toJSON() method exists:

var u8 = new Uint8Array([ 0x41, 0x42, 0x43, 0x44 ]);
print(Duktape.enc('jx', { myBuffer: u8 }));

// Output:
// {myBuffer:|41424344|}

JX respects slice information:

var u8a = new Uint8Array([ 0x41, 0x42, 0x43, 0x44 ]);
var u8b = u8a.subarray(2);
print(Duktape.enc('jx', { myBuffer: u8a, mySlice: u8b }));

// Output:
// {myBuffer:|41424344|,mySlice:|4344|}

Node.js Buffers, having a .toJSON(), will still serialize like with JSON.stringify() because .toJSON() takes precedence (at least in Duktape 2.x) over JX built-in buffer serialization.

Using buffers in C code

Typing

Plain buffers and buffer objects work a bit differently in the C API:

  • Plain buffer stack type is DUK_TYPE_BUFFER and they test true for both duk_is_buffer() and duk_is_buffer_data().

  • Buffer object stack type is DUK_TYPE_OBJECT and they test false for duk_is_buffer(), but true for duk_is_buffer_data().

This mimics how strings currently work in the API: String object also have the DUK_TYPE_OBJECT type tag and test false for duk_is_string(). However, this will probably change at a later time so that plain buffers and buffer objects (and plain strings and String objects) can be used interchangeably.

Plain buffers

Working with a plain fixed buffer

A fixed buffer cannot be resized after its creation, but it is the most memory efficient buffer type and has a stable data pointer. To create a fixed buffer:

unsigned char *ptr;

ptr = (unsigned char *) duk_push_fixed_buffer(ctx, 256 /*size*/);

/* You can now safely read/write between ptr[0] ... ptr[255] until the
 * buffer is collected.
 */

Working with a plain dynamic buffer

A dynamic buffer can be resized after its creation, but requires two heap allocations to allow resizing. The data pointer of a dynamic buffer may change in a resize, so you must re-lookup the data pointer from the buffer may have been resized. Safest approach is to re-lookup right before accessing:

unsigned char *ptr;
duk_size_t len;

/* Create a dynamic buffer, can be resized later using
 * duk_resize_buffer().
 */
ptr = (unsigned char *) duk_push_dynamic_buffer(ctx, 64 /*size*/);

/* You can now safely read/write between ptr[0] ... ptr[63] until a
 * buffer resize (or garbage collection).
 */

/* The buffer can be resized later.  The resize API call returns the new
 * data pointer for convenience.
 */
ptr = (unsigned char *) duk_resize_buffer(ctx, -1, 256 /*new_size*/);

/* You can now safely read/write between ptr[0] ... ptr[255] until a
 * buffer resize.
 */

/* You can also get the current pointer and length explicitly.
 * The safest idiom is to do this right before reading/writing.
 */
ptr = (unsigned char *) duk_require_buffer(ctx, -1, &len);

/* You can now safely read/write between [0, len[. */

Working with a plain external buffer

An external buffer has a data area which is managed by user code: Duktape just stores the current pointer and length and directs any read/write operations to the memory range indicated. User code is responsible for ensuring that this data area is valid for reading and writing, and must ensure the area eventually gets freed.

To create an external buffer:

/* Imaginary example: external buffer is a framebuffer allocated here. */
size_t framebuffer_len;
unsigned char *framebuffer_ptr = init_my_framebuffer(&framebuffer_len);

/* Push an external buffer.  Initially its data pointer is NULL and length
 * is zero.
 */
duk_push_external_buffer(ctx);

/* Configure the external buffer for a certain memory area using
 * duk_config_buffer().  The pointer is not returned because the
 * caller already knows it.
 */
duk_config_buffer(ctx, -1, (void *) framebuffer_ptr, (duk_size_t) framebuffer_len);

/* You can reconfigure the external buffer later as many times as
 * necessary.
 */

/* You can also get the current pointer and length explicitly.
 * The safest idiom is to do this right before reading/writing.
 */
ptr = (unsigned char *) duk_require_buffer(ctx, -1, &len);

Type checking

All plain buffer variants have stack type DUK_TYPE_BUFFER:

if (duk_is_buffer(ctx, idx_mybuffer)) {
    /* value is a plain buffer (fixed, dynamic, or external) */
}

Or equivalently:

if (duk_get_type(ctx, idx_mybuffer) == DUK_TYPE_BUFFER) {
    /* value is a plain buffer (fixed, dynamic, or external) */
}

Buffer objects

Here's a test case with some basic usage:

Creating buffer objects

Buffer objects and view objects are all created with the duk_push_buffer_object() API call:

/* Create a 1000-byte backing buffer.  Only part of the buffer is visible
 * for the view created below.
 */
duk_push_fixed_buffer(ctx, 1000);

/* Create an Uint16Array of 25 elements, backed by plain buffer at index -1,
 * starting from byte offset 100 and having byte length 50.
 */
duk_push_buffer_object(ctx,
                       -1 /*index of plain buffer*/,
                       100 /*byte offset*/,
                       50 /*byte (!) length */,
                       DUK_BUFOBJ_UINT16ARRAY /*flags and type*/);

This is equivalent to:

// Argument plain buffer
var plainBuffer = Uint8Array.allocPlain(1000);

// Create a Uint16Array over the existing plain buffer.
var view = new Uint16Array(plainBuffer.buffer,
                           100 /*byte offset*/,
                           25 /*length in elements (!)*/);

// Outputs: 25 100 50 2
print(view.length, view.byteOffset, view.byteLength, view.BYTES_PER_ELEMENT);

Note that the C call gets a byte length argument (50) while the ECMAScript equivalent gets an element length argument (25). This is intentional for consistency: in the C API buffer lengths are always represented as bytes.

Getting buffer object data pointer

To get the data pointer and length of a buffer object (also works for a plain buffer):

unsigned char *ptr;
duk_size_t len;
duk_size_t i;

/* Get a data pointer to the active slice of a buffer object.  Also
 * accepts a plain buffer.
 */
ptr = (unsigned char *) duk_require_buffer_data(ctx, -3 /*idx*/, &len);

/* You can now safely access indices [0, len[ of 'ptr'. */
for (i = 0; i < len; i++) {
    /* Uppercase ASCII characters. */
    if (ptr[i] >= (unsigned char) 'a' && ptr[i] <= (unsigned char) 'z') {
        ptr[i] += (unsigned char) ('A' - 'a');
    }
}

Type checking

There's currently no explicit type check API call for checking whether a value is a buffer object or not, or to check its specific type. However, the duk_is_buffer_data() API call returns true for both plain buffers and buffer objects:

if (duk_is_buffer_data(ctx, 0)) {
    /* ... */
}

Similarly, duk_get_buffer_data() and duk_require_buffer_data() accept both plain buffers and buffer objects and are a good default idiom to deal with buffer data in C code:

/* First argument must be a plain buffer or a buffer object. */
duk_size_t len;
char *buf = (char *) duk_require_buffer_data(ctx, 0, &len);
/* ... work with 'buf', valid offset range is [0,len[. */

Pointer stability and validity

Any buffer data pointers obtained through the Duktape API are invalidated when the plain buffer or buffer object is garbage collected. You must ensure the buffer is reachable for Duktape while you use a data pointer.

In addition to this, a buffer related data pointer may change from time to time:

  • For fixed buffers the data pointer is stable (until garbage collect).

  • For dynamic buffers the data pointer may change when the buffer is resized using duk_buffer_resize().

  • For external buffers the data pointer may change when the buffer is reconfigured using duk_buffer_config().

  • For buffer objects pointer stability depends on the underlying plain buffer.

Duktape cannot protect user code against using a stale pointer so it's important to ensure any data pointers used in C code are valid. The safest idiom is to always get the buffer data pointer explicitly before using it. For example, by default you should get the buffer pointer before a loop rather than storing it in a global (unless that is justified by e.g. a measurable performance benefit):

unsigned char *buf;
duk_size_t len, i;

buf = (unsigned char *) duk_require_buffer(ctx, -3 /*idx*/, &len);
for (i = 0; i < len; i++) {
    buf[i] ^= 0x80;  /* flip highest bit */
}

Because duk_get_buffer_data() and duk_require_buffer_data() work for both plain buffers and buffer objects, this is more generic:

unsigned char *buf;
duk_size_t len, i;

buf = (unsigned char *) duk_require_buffer_data(ctx, -3 /*idx*/, &len);
for (i = 0; i < len; i++) {
    buf[i] ^= 0x80;  /* flip highest bit */
}

Zero length buffers and NULL vs. non-NULL pointers

For technical reasons discussed below, a buffer with zero length may have either a NULL or a non-NULL data pointer. The pointer value doesn't matter as such because when the buffer length is zero, no read/write is allowed through the pointer (e.g. ptr[0] would refer to a byte outside the valid buffer range).

However, this has a practical impact on structuring code:

unsigned char *buf;
duk_size_t len;

buf = (unsigned char *) duk_get_buffer(ctx, -3, &len);
if (buf != NULL) {
    /* Value was definitely a buffer, buffer length may be zero. */
} else {
    /* Value was not a buffer -or- it might be a buffer with zero length
     * which also has a NULL data pointer.
     */
}

If you don't care about the typing, you can just ignore the pointer check and rely on len alone: for non-buffer values the data pointer will be NULL and length will be zero:

unsigned char *buf;
duk_size_t len, i;

/* If value is not a buffer, buf == NULL and len == 0. */
buf = (unsigned char *) duk_get_buffer(ctx, -3, &len);

/* Can use 'buf' and 'len' directly.  However, note that if len == 0,
 * there's no valid dereference for 'buf'.  This is OK for loops like:
 */
for (i = 0; i < len; i++) {
    /* Never entered if len == 0. */
    printf("%i: %d\n", (int) i, (int) buf[i]);
}

If you don't want that ambiguity you can check for the buffer type explicitly:

unsigned char *buf;
duk_size_t len, i;

/* duk_is_buffer() for plain buffers, duk_is_buffer_data() for plain
 * buffers or buffer objects.
 */
if (duk_is_buffer(ctx, -3)) {
    buf = (unsigned char *) duk_get_buffer(ctx, -3, &len);

    for (i = 0; i < len; i++) {
        /* Never entered if len == 0. */
        printf("%i: %d\n", (int) i, (int) buf[i]);
    }
}

If throwing an error for a non-buffer value is acceptable, this is perhaps the cleanest approach:

unsigned char *buf;
duk_size_t len, i;

/* Or duk_require_buffer_data(). */
buf = (unsigned char *) duk_require_buffer(ctx, -3, &len);

/* Value is definitely a buffer; buf may still be NULL but only if len == 0. */

for (i = 0; i < len; i++) {
    /* Never entered if len == 0. */
    printf("%i: %d\n", (int) i, (int) buf[i]);
}

The technical reason behind this behavior is different for each plain buffer variant:

  • The data area of a fixed buffer is allocated together with the buffer's heap header (it follows the header directly), so that the data pointer for a fixed buffer is always non-NULL, even if it has zero length. The data pointer is simply: (void *) ((duk_hbuffer *) heaphdr + 1).

  • The data area of a dynamic buffer is allocated using separate alloc/realloc call. ANSI C allows an implementation to return a NULL or some non-NULL pointer for a zero size malloc()/realloc(), as long as that pointer is properly ignored by a later free() call. This behavior is allowed for Duktape allocation functions too. Dynamic buffer zero length pointer behavior thus depends directly on the allocator functions used.

  • The data area of an external buffer is controlled by user code. User code can use a NULL or a non-NULL pointer for a zero length buffer; Duktape won't change the pointer value used.

Mixed use

In Duktape 2.0 plain buffers mimic Uint8Arrays and Node.js Buffer behavior has been aligned with Node.js v6.7.0 where Buffers are Uint8Array instances with a custom prototype.

As a result, in Duktape 2.0 it's no longer generally possible (or necessary) to mix buffer types as in Duktape 1.x, where e.g. a Duktape.Buffer could be used as an input argument to new Uint16Array() with some custom behavior.

Common issues and best practices

Resizing and/or appending data to a buffer

Neither the standard ArrayBuffer nor the Node.js Buffer type allow buffer resizing so there's no easy way to efficiently append data to an ArrayBuffer or a Node.js buffer. A trivial but inefficient approach is to always create a new buffer for appended data:

// Node.js example
var data = new Buffer(0);

function received(buf) {
    data = Buffer.concat([ data, buf ]);
}

A better common technique is to accumulate parts and concatenate them when input is finished:

// Node.js example
var parts = [];

function received(buf) {
    parts.push(buf);
}

function finalize() {
    var final = Buffer.concat(parts);
}

Another efficient approach is to keep some spare and avoid excessive copying by e.g. doubling the buffer whenever you're out of space:

// Typed array example
var data = new Uint8Array(64);
var offset = 0;

function received(buf) {
    // Incoming data ('buf') is an Uint8Array

    while (data.length - offset < buf.byteLength) {
        // Not enough space, resize to make room.
        var newBuf = new Uint8Array(data.length * 2);
        newBuf.set(data);  // copy old bytes
        data = newBuf;
    }

    data.set(new Uint8Array(buf), offset);
    offset += buf.byteLength;
}

// When accumulation is finished, final data can be extracted as follows:
var finalArrayBuffer = data.buffer.slice(0, offset);

If you want to use a Duktape specific solution, a dynamic plain buffer can be resized on-the-fly with minimal cost. A dynamic buffer appears to ECMAScript code as an ArrayBuffer whose .length and .byteLength will simply change to reflect a resize of the buffer. Dynamic plain buffers can only be resized from C code. External plain buffers can also be reconfigured on-the-fly which allows e.g. resizing.

Avoiding Duktape custom behaviors

It's best to start with ES2015 typed arrays because they are the "best standard" for buffers in ECMAScript. When doing so, avoid Duktape specific behavior unless you really need to. Particular gotchas are discussed below.

Avoid relying on memory zeroing of Node.js Buffers

ES2015 specification requires that new ArrayBuffer values be filled with zeroes. Starting from Duktape 1.4.0, Duktape follows this even when the DUK_USE_ZERO_BUFFER_DATA config option is turned off.

Node.js does not zero allocated Buffer objects by default. Duktape zeroes Node.js Buffer objects too, unless the DUK_USE_ZERO_BUFFER_DATA config option is turned off.

Security considerations

Duktape guarantees that no out-of-bounds accesses are possible to an underlying plain buffer by any ECMAScript code.

This guarantee is in place even if you initialize a buffer object using a dynamic plain buffer which is then resized so that the conceptual buffer object extends beyond the resized buffer. In such cases Duktape doesn't provide very clean behavior (some operations return zero, others may throw a TypeError, etc) but the behavior is guaranteed to be memory safe. This situation is illustrated (and tested for) in the following test case:

C code interacting with buffers through property reads/writes is guaranteed to be memory safe. C code may fetch a pointer and a length to an underlying buffer and operate on that directly; memory safety is up to user code in that situation.

When an external plain buffer is used, it's up to user code to ensure that the pointer and length configured into the buffer are valid, i.e. all bytes in that range are readable and writable. If this is not the case, memory unsafe behavior may happen.