About NodeJS Buffers - javascript

Question about buffers.
if i have this native code:
unsigned char* data = static_cast<unsigned char*>(malloc(sizeof(int) * 4));
Napi::Buffer<unsigned char> buffer = Napi::Buffer::New(info.Env(), data, sizeof(int) * 4); // create buffer
return buffer;
it would return the buffer to js, does the GC handle the malloc'd data? or would it cause a leak since i didn't free it. The reason i asked this here is because i am dealing with js buffers

It looks like you should manually free it use other overload
https://github.com/nodejs/node-addon-api/blob/master/doc/buffer.md#new-2
The Napi::Buffer object does not assume ownership for the data and expects it to be valid for the lifetime of the object. The data can only be freed once the finalizeCallback is invoked to indicate that the Napi::Buffer has been released.
template <typename Finalizer>
static Napi::Buffer<T> Napi::Buffer::New(napi_env env
T* data,
size_t length,
Finalizer finalizeCallback);
note : I haven't use NAPI before but I think it's the correct document.

Related

Webassembly: possible to have shared objects?

I am wondering if, using C (or C++ or Rust) and javascript, I am able to do CRUD operations to a shared data object. Using the most basic example, here would be an example or each of the operations:
#include <stdio.h>
typedef struct Person {
int age;
char* name;
} Person;
int main(void) {
// init
Person* sharedPersons[100];
int idx=0;
// create
sharedPersons[idx] = (Person*) {12, "Tom"};
// read
printf("{name: %s, age: %d}", sharedPersons[idx]->name, sharedPersons[idx]->age);
// update
sharedPersons[idx]->age = 11;
// delete
sharedPersons[idx] = NULL;
}
Then, I would like to be able to do the exact same thing in Javascript, and both be able to write to the same shared sharedPersons object. How could this be done? Or does the setup need to be something like a 'master-slave' where one just needs to pass back information to the other and the master does all the relevant actions? I'm hoping that there's a way do CRUD on a shared data object in webassembly, and any help would be greatly appreciated.
As a reference: https://rustwasm.github.io/wasm-bindgen/contributing/design/js-objects-in-rust.html
Creating the object
Let's create the object in C and return it:
typedef struct Person {
int age;
char* name;
} Person;
Person *get_persons(void) {
Person* sharedPersons[100];
return sharedPersons;
}
You could also create the object in JS, but it's harder. I'll come back to this later.
In order for JS to get the object, we've defined a function (get_persons) that returns (a pointer to) it. In this case it's an array, but of course it could have been a single object. The thing is, there must be a function that will be called from JS and that provides the object.
Compiling the program
emcc \
-s "SINGLE_FILE=1" \
-s "MODULARIZE=1" \
-s "ALLOW_MEMORY_GROWTH=1" \
-s "EXPORT_NAME=createModule" \
-s "EXPORTED_FUNCTIONS=['_get_persons', '_malloc', '_free']" \
-s "EXPORTED_RUNTIME_METHODS=['cwrap', 'setValue', 'getValue', 'AsciiToString', 'writeStringToMemory']" \
-o myclib.js
person.c
I don't remember why we have a leading underscore in _get_persons, but that's how Emscripten works.
Getting the object in JS
const createModule = require('./myclib');
let myclib;
let Module;
export const myclibRuntime = createModule().then((module) => {
get_persons: Module.cwrap('get_persons', 'number', []),
});
What this does is create a get_persons() JS function that is a wrapper of the C get_persons() function. The return value of the JS function is "number". Emscripten knows that the C get_persons() function returns a pointer, and the wrapper will convert that pointer to a JS number. (Pointers in WASM are 32-bit.)
Manipulating the object in JS
const persons = get_persons();
Module.getValue(persons, 'i32'); // Returns the age of the first person
Module.AsciiToString(Module.getValue(persons + 4, 'i32')); // Name of first person
// Set the second person to be "Alice", age 18
const second_person = persons + 8;
Module.setValue(second_person, 18, 'i32');
const buffer = Module._malloc(6); // Length of "Alice" plus the null terminator
Module.writeStringToMemory("Alice", buffer);
Module.setValue(second_person + 4, buffer, 'i32');
This is a fairly low level way of doing it, although there seems to be an even lower level way. As other people have suggested, there may be higher level tools to help in C++ and Rust.
Creating the object in JS
You can create objects in JS by using _malloc() (and free them with _free()) as we did with the string above, and then pass their pointers to C functions. But, as I said, creating them in C is probably easier. In any case, anything _malloc()ed must eventually be freed (so the string creation above is incomplete). The FinalizationRegistry can help with this.
Yes, this is possible.
WebAssembly stores objects within linear memory, a contiguous array of bytes that the module can read and write to. The host environment (typically JavaScript within the web browser) can also read and write to linear memory, allowing it to access the objects that the WebAssembly modules stores there.
There are two challenges here:
How do you find where your WebAssembly module has stored an object?
How is the object encoded?
You need to ensure that you can read and write these objects from both the WebAssembly module and the JavaScript host.
I'd pick a known memory location, and a known serialisation format and use that to read/write from both sides.

JavaScript: Most efficient and performant way to partition ArrayBuffer memory

I would compare what I am doing to what JavaScript runtimes already do, yet I'm doing it in JavaScript and Wasm. JavaScript implementations store JavaScript objects and values in actual computer heap memory, yet performing operations such as attempting to read/write out of bounds memory don't actually modify the memory (ex: arrays perform a no-op and return undefined respectively).
I'll give an example of my specific situation:
Let's say that I have an array buffer of 1000 bytes, we'll name the variable memory.
I want to split apart the buffer specifically into Int32Arrays of size 4. Each partition from the ArrayBuffer must do two things:
a) Refer to the original buffer (so that, when the original data is manipulated, the partition will update its values automaticially)
b) Not expose the original buffer (as the partition could then be used to corrupt the other partitions)
I have a function that determines which section is available for usage, we'll call it findPartition. It returns an integer acting as a pointer to a set of available bytes. (like C's malloc)
Each partition is expected to always remain the same type, that is, they will always be Int32Arrays if they start as an Int32Array, and their size will always be constant.
The script operating on the partition may both, write to, and read from, its partitioned array.
Originally, I was thinking that I could just call the Int32Array constructor on my array buffer, simply like so:
const createPartition = () => new Int32Array( memory, findPartition(), 4 );
The problem is that the buffer is exposed, so I could either delete the buffer property.
But... the buffer property is readonly, so delete fails when used on the array.
I then thought that I could make a class to do this:
class Partition {
#source = new Int32Array( memory, findPartition(), 4 );
get 0() { return this.#source[0]; }
set 0(x) { this.#source[0] = x; }
get 1() { return this.#source[1]; }
set 1(x) { this.#source[1] = x; }
get 2() { return this.#source[2]; }
set 2(x) { this.#source[2] = x; }
get 3() { return this.#source[3]; }
set 3(x) { this.#source[3] = x; }
get length() { return 4; }
};
Well, that works, but it's much more verbose, thus harder to maintain later, and, as the partitions are not given direct access to the indexes' values, because they have to go through getters and setters, I feel that performance could be lost.
Ideally, the Int32Array.prototype is also on the object, so I would have to wrap everything, which would be annoying and unmaintainable. If the spec updates the methods of the prototype, then I would have to update the wrappers too.
Does anyone have a better way to segment the array buffer, while maintaining safety between the segments?
Simplest way is to extend chosen typed array like that:
// Seal and freeze hidden Object (TypedArray.prototype) that has methods
// that can leak original buffer if attacked using defineProperty tricks
// Since we can't directly access hidden Object on which `subarray`
// and many other methods defined we use this workaround
// Paranoid: More checks required to make sure that: `subarray` method; 'byteOffset',
// 'byteLength', 'buffer' getters; are not modified beforehand
Object.seal(Int8Array.__proto__.prototype);
Object.freeze(Int8Array.__proto__.prototype);
class customUint32Array extends Uint32Array {
get buffer(){
// copy! viewed array buffer segment
// test if `super.` is faster/slower than `this.` access
return super.buffer.slice(this.byteOffset, this.byteOffset + this.byteLength);
// return super.buffer.slice(super.byteOffset, super.byteOffset + super.byteLength);
}
}
// var customUint32ArrayOverWholeBufferCached = new customUint32Array(memory);
function Partition(){
// test performance of `new` vs `customUint32ArrayOverWholeBufferCached.subarray`
// for fastest array buffer view creation
return new customUint32Array(memory, findPartitionByteOffset(), 4);
// return customUint32ArrayOverWholeBufferCached.subarray(findPartitionIndex(), 4);
}
By the way 'private' class properties in most JS environments are exposed as any other property and will leak original buffer.
Prototype chains forged manually instead of class X extends Y are welcome in comments.
If one will pass my original buffer leak tests, I'll include it here.
Current instance' prototype chain looks something like: customUint32Array.Uint32Array.TypedArray.prototype.Object

Forward arraybuffer from C to JS with node-api

Im currently trying to do some low level coding with JS.
For that reason im using https://nodejs.org/api/n-api.html to add custom C code to my node.js runtime.
I get passing values and changing them in c to work, even reading arraybuffers and interpreting them the way i want to in C, but im only able to return limited JS values (numbers and strings, as seen in this part https://nodejs.org/api/n-api.html#n_api_functions_to_convert_from_c_types_to_n_api)
Does anybody know how to get N-API arraybuffers? I'd want give my JS a certain buffer i defined in C, and then work via Dataviews.
I found the answer:
https://nodejs.org/api/n-api.html#n_api_napi_create_external_arraybuffer
I was looking for different keywords than "external", but this is exactly what I looked for:
You define a buffer in C beforehand and then create a NAPI/JS array buffer which uses that underlying buffer.
napi_create_arraybuffer would clear the buffer, which could then be manipulated in C as well, but you couldn't e.g. load a file and then use that buffer. So napi_create_external_arraybuffer is the way to go.
edit: when I asked the question I was writing my open-source bachelor's thesis, so here's how I used it in the end: https://github.com/ixy-languages/ixy.js/blob/ce1d7130729860245527795e483b249a3d92a0b2/src/module.c#L112
I don‘t know if this helps (I‘m also relatively new to N-API.) but you can create an arraybuffer from a void* and a fixed length: https://nodejs.org/api/n-api.html#n_api_napi_create_arraybuffer
For example:
napi_value CreateArrayBuffer(napi_env env, napi_callback_info info) {
// the value to return
napi_value arrayBuffer;
// allocates 100 bytes for the ArrayBuffer
void* yourPointer = malloc(100 /* bytes */);
// creates your ArrayBuffer
napi_create_arraybuffer(env, 100 /* bytes */, &yourPointer, &arrayBuffer);
return arrayBuffer; // ArrayBuffer with 100 bytes length
}

Handling output buffers in emscripten

Lets say I have a C API as follows:
void get_result_buffer(context* ctx, void** result, size_t* result_size);
Where context is some arbitrary opaque context type holding state. The intended way to call this is
context* ctx = ...;
do_something_with_context(ctx, ...);
void* result_buffer = 0;
size_t result_buffer_size = 0;
get_result_buffer(ctx, &result_buffer, &result_buffer_size);
/* Now result_buffer and result_buffer_size are meaningful and populated with the results of having called `do_something_with_context`. */
The result_buffer is owned by the context object, so the caller doesn't need to free it. Now I'd like to be able to call get_result_buffer from Emscripten. I can easily enough set up cwrap for this, it looks something like:
wrap_get_result_buffer = something.cwrap(
'get_result_buffer',
null,
['number', 'number', 'number']
)
But I'm unclear how I can set things up so that the out parameters "work" in JS. Ideally, at the end, I'd have something that looks like a byte buffer containing a copy of the data pointed to by the result out parameter, with a length as described by the result_size out parameter.
It seems that the values that I pass in need to be allocated somehow, and then I would pass the resulting allocation handle in as the number type parameters, but I have no idea how to do that in the JS/Emscripten layer. Similarly, after the call, I'd expect that those values have now been updated by the transpiled C code, but I'm unclear on how to extract the now populated data into some sort of JS byte array.
Any guidance on how to do this or pointers to example code?
OK. I figured this out. For future emscripteners, you want to do something like the following.
var out_data_ptr = Module._malloc(8)
var out_data_array = new Uint32Array(Module.HEAPU32.buffer, out_data_ptr, 2)
wrap_get_result_buffer(context, out_data_ptr, out_data_ptr + 4)
var response_uint8_array = new Uint8Array(Module.HEAPU8.buffer, out_data_array[0], out_data_array[1])
Module._free(out_data_ptr)
The theory of operations here is that we create a two element array that will store the 'slots' to be filled in by calling get_result_buffer, and then construct a view over that exposing it as two number compatible elements. We then pass those in to our get_result_buffer function as lifted above with cwrap. After that, the heap memory that the context refers to is reachable from those slots, which can then be used to construct a Uint8Array that provides JS level access to the bytes in the result.

How to transfer large objects using postMessage of webworker?

I have read that transferable objects can be transferred really fast using postmessage of web worker. According to this transferable objects are either arraybuffer or messageport.
Question is, how do I convert say an arbitrary object that is of large size (30 mb) to a transferable object and pass it as an argument to postmessage. From what I understand I can convert my array to json string and then convert json string to raw byte data and store that inside of an array object. However, this seems to defeat the purpose of fast transferring.
could someone enlighten me to pass an object as transferable object or if it's even possible?
Thanks in advance!
This misconception is quite recurring here. You're imagining that it's possible to write some fast javascript code to convert your large object into transferable. But indeed, any conversion code you write defeats the purpose, just as you said. And the more complex data, the more speed you lose.
Objects are normally (when not transfering) converted by native structured clone algorithm (which uses implementation defined format and sure is optimal). Any javascript code you write will most likely be slower than structured clone, while achieving the same goal - transferring data as binary.
The purpose of transferable objects is to allow transfer for binary data, such as images (from canvas), audio or video. These kinds of data can be transferred without being processed by structured clone algorithm, which is why transferable interface was added. And the effect is insignificant even for these - see an answer about transferable speed.
As a last note, I wrote a prototype based library that converts javascript objects to ArrayBuffer and back. It's slower, especially for JSON like data. It's advantages (and advantages of any similar code you write) are:
You can define custom object conversions
You can use inheritance (eg. sending your own types, like Foo)
Code to transfer JSON like object
If your data is like JSON, just stick to structured clone and do not transfer. If you don't trust me, test it with this code. You will see it's slower than normal postMessage.
var object = {dd:"ddd", sub:{xx:"dd"}, num:666};
var string = JSON.stringify(object);
var uint8_array = new TextEncoder(document.characterSet.toLowerCase()).encode(string);
var array_buffer = uint8_array.buffer;
// now transfer array buffer
worker.postMessage(array_buffer, [array_buffer])
The opposite conversion, considering you have some ArrayBuffer:
// Let me just generate some array buffer for the simulation
var array_buffer = new Uint8Array([123,34,100,100,34,58,34,100,100,100,34,44,34,115,117,98,34,58,123,34,120,120,34,58,34,100,100,34,125,44,34,110,117,109,34,58,54,54,54,125]).buffer;
// Now to the decoding
var decoder = new TextDecoder("utf-8");
var view = new DataView(array_buffer, 0, array_buffer.byteLength);
var string = decoder.decode(view);
var object = JSON.parse(string);
Should have looked up Tomas's answer earlier.
Proof, although not specifically the way Tomas suggested.
Version A
Version B
I manually converted to a stringified json obejct to a Uint8Array like so:
function stringToUintArray(message) {
var encoded = self.btoa(message);
var uintArray = Array.prototype.slice.call(encoded).map(ch => ch.charCodeAt(0));
var uarray = new Uint8Array(uintArray);
return uarray;
}
and transferred it like so from the web worker to main thread:
console.time('generate');
var result = generate(params.low, params.high, params.interval, params.size);
var uarr = stringToUintArray(JSON.stringify(result));
console.timeEnd('generate');
self.postMessage(uarr.buffer, [uarr.buffer]);
and on the main thread I did something like this:
var uarr = new Uint8Array(e.data);
var json = UintArrayToString(uarr);
var result = JSON.parse(json);

Categories

Resources