Webassembly: possible to have shared objects? - javascript

I am wondering if, using C (or C++ or Rust) and JavaScript, I am able to do CRUD operations on a shared data object. Using the most basic example, here is each of the operations:
#include <stdio.h>
typedef struct Person {
    int age;
    char* name;
} Person;
int main(void) {
    // init
    Person* sharedPersons[100];
    int idx = 0;
    // create (take the address of a compound literal; a bare cast to Person* does not compile)
    sharedPersons[idx] = &(Person){12, "Tom"};
    // read
    printf("{name: %s, age: %d}", sharedPersons[idx]->name, sharedPersons[idx]->age);
    // update
    sharedPersons[idx]->age = 11;
    // delete
    sharedPersons[idx] = NULL;
}
Then, I would like to be able to do the exact same thing in JavaScript, with both sides able to write to the same shared sharedPersons object. How could this be done? Or does the setup need to be something like a 'master-slave' arrangement, where one side just passes information to the other and the master does all the relevant actions? I'm hoping that there's a way to do CRUD on a shared data object in WebAssembly, and any help would be greatly appreciated.
As a reference: https://rustwasm.github.io/wasm-bindgen/contributing/design/js-objects-in-rust.html

Creating the object
Let's create the object in C and return it:
typedef struct Person {
    int age;
    char* name;
} Person;
Person *get_persons(void) {
    // static storage, so the array outlives the call; an array of structs, not of pointers
    static Person sharedPersons[100];
    return sharedPersons;
}
You could also create the object in JS, but it's harder. I'll come back to this later.
In order for JS to get the object, we've defined a function (get_persons) that returns (a pointer to) it. In this case it's an array, but of course it could have been a single object. The thing is, there must be a function that will be called from JS and that provides the object.
Compiling the program
emcc \
-s "SINGLE_FILE=1" \
-s "MODULARIZE=1" \
-s "ALLOW_MEMORY_GROWTH=1" \
-s "EXPORT_NAME=createModule" \
-s "EXPORTED_FUNCTIONS=['_get_persons', '_malloc', '_free']" \
-s "EXPORTED_RUNTIME_METHODS=['cwrap', 'setValue', 'getValue', 'AsciiToString', 'writeStringToMemory']" \
-o myclib.js \
person.c
The leading underscore in _get_persons is simply how Emscripten names exported C symbols, so that is the name to list in EXPORTED_FUNCTIONS.
Getting the object in JS
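const createModule = require('./myclib');
let Module;
let get_persons;
// createModule() returns a promise that resolves to the module instance
const myclibRuntime = createModule().then((instance) => {
  Module = instance;
  get_persons = Module.cwrap('get_persons', 'number', []);
  return instance;
});
module.exports = { myclibRuntime };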
What this does is create a get_persons() JS function that wraps the C get_persons() function. The return type is declared as "number" because the C function returns a pointer, and a pointer is just an integer offset into the module's linear memory, so the wrapper hands it to JS as a plain number. (Pointers in 32-bit WASM are 32 bits wide.)
Manipulating the object in JS
const persons = get_persons();
Module.getValue(persons, 'i32'); // Returns the age of the first person
Module.AsciiToString(Module.getValue(persons + 4, 'i32')); // Name of first person
// Set the second person to be "Alice", age 18
const second_person = persons + 8;
Module.setValue(second_person, 18, 'i32');
const buffer = Module._malloc(6); // Length of "Alice" plus the null terminator
Module.writeStringToMemory("Alice", buffer);
Module.setValue(second_person + 4, buffer, 'i32');
This is a fairly low level way of doing it, although there seems to be an even lower level way. As other people have suggested, there may be higher level tools to help in C++ and Rust.
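To make the offsets above less magical, here is a minimal sketch of what getValue/setValue boil down to. It does not use Emscripten at all: a plain WebAssembly.Memory stands in for the module's memory, and the addresses 1024 and 4096 are made up (a real program would get them from get_persons() and _malloc()). The layout is the one assumed above: a 4-byte age followed by a 4-byte name pointer, 8 bytes per Person.

```javascript
// Stand-in for the module's linear memory (one 64 KiB page).
const memory = new WebAssembly.Memory({ initial: 1 });
const PERSON_SIZE = 8; // assumed layout: int age (4 bytes) + char* name (4 bytes)
const AGE_OFFSET = 0;
const NAME_OFFSET = 4;

// Re-create the view on every access: growing the memory detaches old buffers.
const view = () => new DataView(memory.buffer);

// Copy a NUL-terminated ASCII string into memory at `ptr`.
function writeCString(ptr, text) {
  const bytes = new Uint8Array(memory.buffer, ptr, text.length + 1);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
  bytes[text.length] = 0;
}

// Read a NUL-terminated ASCII string starting at `ptr`.
function readCString(ptr) {
  const bytes = new Uint8Array(memory.buffer);
  let out = '';
  for (let i = ptr; bytes[i] !== 0; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

function writePerson(base, index, age, namePtr) {
  const p = base + index * PERSON_SIZE;
  view().setInt32(p + AGE_OFFSET, age, true); // WebAssembly is little-endian
  view().setUint32(p + NAME_OFFSET, namePtr, true);
}

function readPerson(base, index) {
  const p = base + index * PERSON_SIZE;
  return {
    age: view().getInt32(p + AGE_OFFSET, true),
    name: readCString(view().getUint32(p + NAME_OFFSET, true)),
  };
}

// Hypothetical addresses, chosen only for this sketch.
const persons = 1024;
const nameBuffer = 4096;
writeCString(nameBuffer, 'Alice');
writePerson(persons, 1, 18, nameBuffer);
console.log(readPerson(persons, 1)); // { age: 18, name: 'Alice' }
```

The struct offsets are an assumption about how the compiler lays out Person on wasm32; if the struct changes, the constants have to change with it.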
Creating the object in JS
You can create objects in JS by using _malloc() (and free them with _free()) as we did with the string above, and then pass their pointers to C functions. But, as I said, creating them in C is probably easier. In any case, anything _malloc()ed must eventually be freed (so the string creation above is incomplete). The FinalizationRegistry can help with this.
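As a rough sketch of the FinalizationRegistry idea: wrap each allocation in a small JS object, free it explicitly when you can, and let the registry act as a safety net when you forget. fakeMalloc and fakeFree below are made-up stand-ins for Module._malloc and Module._free so that the snippet runs on its own, and WasmString is a hypothetical helper name.

```javascript
// Hypothetical stand-ins for Module._malloc / Module._free.
const live = new Set();
let nextPtr = 8;
const fakeMalloc = (size) => {
  const ptr = nextPtr;
  nextPtr += size;
  live.add(ptr);
  return ptr;
};
const fakeFree = (ptr) => {
  live.delete(ptr);
};

// Safety net: runs some time after a WasmString wrapper has been garbage collected.
const registry = new FinalizationRegistry((ptr) => fakeFree(ptr));

class WasmString {
  constructor(text) {
    this.ptr = fakeMalloc(text.length + 1); // room for the NUL terminator
    registry.register(this, this.ptr, this); // `this` doubles as the unregister token
  }
  // Preferred path: free deterministically and cancel the finalizer.
  dispose() {
    if (this.ptr === 0) return;
    registry.unregister(this);
    fakeFree(this.ptr);
    this.ptr = 0;
  }
}

const alice = new WasmString('Alice');
console.log(live.size); // 1
alice.dispose();
console.log(live.size); // 0
```

Finalizers run at an unspecified time (or possibly never), so I would treat the registry only as a backstop and still call dispose() where the lifetime is known.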

Yes, this is possible.
WebAssembly stores objects within linear memory, a contiguous array of bytes that the module can read from and write to. The host environment (typically JavaScript within the web browser) can also read and write linear memory, allowing it to access the objects that the WebAssembly module stores there.
There are two challenges here:
How do you find where your WebAssembly module has stored an object?
How is the object encoded?
You need to ensure that you can read and write these objects from both the WebAssembly module and the JavaScript host.
I'd pick a known memory location, and a known serialisation format and use that to read/write from both sides.
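For illustration, a minimal sketch of the "known location, known format" idea from the JS side. The offset 1024 and the format (a 4-byte little-endian length followed by UTF-8 JSON) are assumptions made up for this example; the WebAssembly module would have to read and write the same bytes at the same address.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const SHARED_OFFSET = 1024; // assumed fixed address agreed on by both sides
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Layout: [u32 byte length, little-endian][UTF-8 JSON bytes]
function writeShared(value) {
  const bytes = encoder.encode(JSON.stringify(value));
  new DataView(memory.buffer).setUint32(SHARED_OFFSET, bytes.length, true);
  new Uint8Array(memory.buffer, SHARED_OFFSET + 4, bytes.length).set(bytes);
}

function readShared() {
  const length = new DataView(memory.buffer).getUint32(SHARED_OFFSET, true);
  const bytes = new Uint8Array(memory.buffer, SHARED_OFFSET + 4, length);
  return JSON.parse(decoder.decode(bytes));
}

writeShared([{ name: 'Tom', age: 12 }]); // create
const people = readShared();             // read
people[0].age = 11;                      // update
writeShared(people);
console.log(readShared());
```

JSON is only one possible format; a fixed binary struct layout avoids the parsing cost but couples both sides to the exact byte offsets.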

Related

JavaScript: Most efficient and performant way to partition ArrayBuffer memory

I would compare what I am doing to what JavaScript runtimes already do, except that I'm doing it in JavaScript and Wasm. JavaScript implementations store objects and values in real heap memory, yet an attempt to read or write out of bounds never touches memory it shouldn't (e.g. a typed array ignores an out-of-bounds write and returns undefined for an out-of-bounds read).
I'll give an example of my specific situation:
Let's say that I have an array buffer of 1000 bytes, we'll name the variable memory.
I want to split apart the buffer specifically into Int32Arrays of size 4. Each partition from the ArrayBuffer must do two things:
a) Refer to the original buffer (so that, when the original data is manipulated, the partition will update its values automatically)
b) Not expose the original buffer (as the partition could then be used to corrupt the other partitions)
I have a function that determines which section is available for usage, we'll call it findPartition. It returns an integer acting as a pointer to a set of available bytes. (like C's malloc)
Each partition is expected to always remain the same type, that is, they will always be Int32Arrays if they start as an Int32Array, and their size will always be constant.
The script operating on the partition may both write to and read from its partitioned array.
Originally, I was thinking that I could just call the Int32Array constructor on my array buffer, simply like so:
const createPartition = () => new Int32Array( memory, findPartition(), 4 );
The problem is that the buffer is exposed through the view's buffer property, so my first idea was to delete that property.
But... the buffer property is read-only, so delete fails when used on the array.
I then thought that I could make a class to do this:
class Partition {
#source = new Int32Array( memory, findPartition(), 4 );
get 0() { return this.#source[0]; }
set 0(x) { this.#source[0] = x; }
get 1() { return this.#source[1]; }
set 1(x) { this.#source[1] = x; }
get 2() { return this.#source[2]; }
set 2(x) { this.#source[2] = x; }
get 3() { return this.#source[3]; }
set 3(x) { this.#source[3] = x; }
get length() { return 4; }
};
Well, that works, but it's much more verbose and thus harder to maintain later. Also, because every index access has to go through a getter or setter instead of reaching the value directly, I feel that performance could be lost.
Ideally, the Int32Array.prototype is also on the object, so I would have to wrap everything, which would be annoying and unmaintainable. If the spec updates the methods of the prototype, then I would have to update the wrappers too.
Does anyone have a better way to segment the array buffer, while maintaining safety between the segments?
The simplest way is to extend the chosen typed array like this:
// Seal and freeze hidden Object (TypedArray.prototype) that has methods
// that can leak original buffer if attacked using defineProperty tricks
// Since we can't directly access hidden Object on which `subarray`
// and many other methods defined we use this workaround
// Paranoid: More checks required to make sure that: `subarray` method; 'byteOffset',
// 'byteLength', 'buffer' getters; are not modified beforehand
Object.seal(Int8Array.__proto__.prototype);
Object.freeze(Int8Array.__proto__.prototype);
class customUint32Array extends Uint32Array {
get buffer(){
// copy! viewed array buffer segment
// test if `super.` is faster/slower than `this.` access
return super.buffer.slice(this.byteOffset, this.byteOffset + this.byteLength);
// return super.buffer.slice(super.byteOffset, super.byteOffset + super.byteLength);
}
}
// var customUint32ArrayOverWholeBufferCached = new customUint32Array(memory);
function Partition(){
// test performance of `new` vs `customUint32ArrayOverWholeBufferCached.subarray`
// for fastest array buffer view creation
return new customUint32Array(memory, findPartitionByteOffset(), 4);
// return customUint32ArrayOverWholeBufferCached.subarray(findPartitionIndex(), 4);
}
By the way, in JS environments where 'private' class properties are emulated rather than native, they are exposed like any other property and will leak the original buffer.
Prototype chains forged manually instead of class X extends Y are welcome in comments.
If one will pass my original buffer leak tests, I'll include it here.
The current instance's prototype chain looks something like: customUint32Array -> Uint32Array -> TypedArray -> Object
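Here is a small, self-contained check of how the subclass approach behaves, offered as a sketch rather than a verdict. GuardedUint32Array is a hypothetical name, and the byte offset 16 stands in for whatever findPartition() would return. It shows that writes still reach the original memory and that the overridden getter returns a copy, but also that the inherited getter can still be called directly, which is one of the leaks a stricter solution would have to close.

```javascript
const memory = new ArrayBuffer(1000);

class GuardedUint32Array extends Uint32Array {
  // Hand out a copy of this view's bytes instead of the shared buffer.
  get buffer() {
    return super.buffer.slice(this.byteOffset, this.byteOffset + this.byteLength);
  }
}

// Byte offset 16 is a placeholder for the result of findPartition().
const partition = new GuardedUint32Array(memory, 16, 4);
partition[0] = 42;

// Writes go through to the original memory...
console.log(new Uint32Array(memory, 16, 1)[0]); // 42
// ...while the overridden getter returns a 16-byte copy.
console.log(partition.buffer === memory, partition.buffer.byteLength); // false 16

// The inherited getter can still be invoked directly, bypassing the override.
const inheritedGetter = Object.getOwnPropertyDescriptor(
  Object.getPrototypeOf(Uint32Array.prototype),
  'buffer'
).get;
console.log(inheritedGetter.call(partition) === memory); // true
```

Because of that last line, I would only rely on this pattern against accidental misuse, not against code that is actively trying to reach the underlying buffer.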

Is it possible to access the value of a Symbol in JavaScript?

I have been just introduced to the concept of Symbols in JavaScript. I have been informed that they can be used to create unique identifiers to avoid potential clashes.
For example...
let user = {
name : "John",
[Symbol("id")]: 123, // Symbol("id") is a unique identifier
[Symbol("id")]: 123 // and the Symbol("id") here is also different and unique
};
... I understand the above code. However, what is the actual identifier value of each "Symbol("id")"? How do I find out?
Any pointers appreciated.
No, you cannot see the "raw" value of a symbol from within JavaScript: it is implemented in the native C++ code of the JS engine itself, and that implementation gives you no way to display it in the console or on a page.
You can think of symbols as big numbers and every time you create a symbol, a new random number gets generated (uuid). You can use that symbol (the random big number) as a key in objects.
Regarding this definition, you can look at a possible implementation of the symbol in ES5:
var Symbol;
if (!Symbol) {
Symbol = (function(Object){
// (C) WebReflection Mit Style License
var ObjectPrototype = Object.prototype,
defineProperty = Object.defineProperty,
prefix = '__simbol' + Math.random() + '__',
id = 0;
//... some other code
You can see that Math.random() is used here, so a Symbol in this implementation has a long random number as its main property (unique with high probability).
Another way to explain a symbol is as a piece of memory in which you can store some data. Each symbol points to a different memory location. In the context of this definition, you can look at the C++ source code of the JS engine itself, for example V8, which is used in Chromium. If you know C++, you can try to find the implementation of the Symbol() constructor there, but it won't be easy.
Therefore, we can say that a Symbol is a kind of unique memory area that a certain string describes. A memory area is already a term from low-level programming, so you can think of it as something like 1010010011...
const id1 = Symbol("id");
const id2 = Symbol("id");
const user = {
name: "John",
[id1]: 123, // "[id1]" is a unique identifier
[id2]: 456, // and the value of "[id2]" here is also different
};
console.log('id1:', user[id1], id1.description);
console.log('id2:', user[id2], id2.description);
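To sum up what can actually be observed about a symbol from JS, here is a short sketch using only standard built-ins. There is no hidden "value" to print; what you get is identity, the description, and the ability to list symbol keys on an object.

```javascript
const a = Symbol('id');
const b = Symbol('id');

console.log(a === b);       // false: same description, different symbols
console.log(a.description); // id
console.log(a.toString());  // Symbol(id)

const record = { name: 'John', [a]: 123, [b]: 456 };

// Symbol keys are skipped by Object.keys but can be listed explicitly.
console.log(Object.keys(record));                         // [ 'name' ]
console.log(Object.getOwnPropertySymbols(record).length); // 2

// Symbol.for uses a global registry, so equal keys give the same symbol.
console.log(Symbol.for('id') === Symbol.for('id')); // true
console.log(Symbol.keyFor(Symbol.for('id')));       // id
```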
I wasn't able to fully understand your question, but I tried to help; I hope this works:
let user = { // belongs to another code
name: "Alex"
};
let id = Symbol("id");
user[id] = 200;
alert( user[id] ); // we can access the data using the symbol as the key
From MDN:
Every symbol value returned from Symbol() is unique. A symbol value
may be used as an identifier for object properties; this is the data
type's only purpose.
console.log(Symbol('foo') === Symbol('foo')) // false
From an article in this answer:
ES6 symbols are similar to the more traditional symbols in languages
like Lisp and Ruby, but not so closely integrated into the language.
In Lisp, all identifiers are symbols. In JS, identifiers and most
property keys are still considered strings. Symbols are just an extra
option.
As the MDN docs explain, you can access the description that you passed, but not the value of the Symbol:
Most values in JavaScript support implicit conversion to a string. For
instance, we can alert almost any value, and it will work. Symbols are
special. They don’t auto-convert.
For example,
let Sym = Symbol("Sym");
alert(Sym); // TypeError: Cannot convert a Symbol value to a string
That’s a "language guard" against messing up, because strings and
symbols are fundamentally different and should not accidentally
convert one into another.
If we really want to show a symbol, we need to call .toString() on it,
for example,
let Sym = Symbol("Sym");
alert(Sym.toString()); // Symbol(Sym), now it works
Or we can use the symbol.description property to get
only the description, for example,
let _Sym = Symbol("Sym");
alert(_Sym.description); // Sym

Handling output buffers in emscripten

Let's say I have a C API as follows:
void get_result_buffer(context* ctx, void** result, size_t* result_size);
Where context is some arbitrary opaque context type holding state. The intended way to call this is
context* ctx = ...;
do_something_with_context(ctx, ...);
void* result_buffer = 0;
size_t result_buffer_size = 0;
get_result_buffer(ctx, &result_buffer, &result_buffer_size);
/* Now result_buffer and result_buffer_size are meaningful and populated with the results of having called `do_something_with_context`. */
The result_buffer is owned by the context object, so the caller doesn't need to free it. Now I'd like to be able to call get_result_buffer from Emscripten. I can easily enough set up cwrap for this, it looks something like:
wrap_get_result_buffer = something.cwrap(
'get_result_buffer',
null,
['number', 'number', 'number']
)
But I'm unclear how I can set things up so that the out parameters "work" in JS. Ideally, at the end, I'd have something that looks like a byte buffer containing a copy of the data pointed to by the result out parameter, with a length as described by the result_size out parameter.
It seems that the values that I pass in need to be allocated somehow, and then I would pass the resulting allocation handle in as the number type parameters, but I have no idea how to do that in the JS/Emscripten layer. Similarly, after the call, I'd expect that those values have now been updated by the transpiled C code, but I'm unclear on how to extract the now populated data into some sort of JS byte array.
Any guidance on how to do this or pointers to example code?
OK. I figured this out. For future emscripteners, you want to do something like the following.
var out_data_ptr = Module._malloc(8)
var out_data_array = new Uint32Array(Module.HEAPU32.buffer, out_data_ptr, 2)
wrap_get_result_buffer(context, out_data_ptr, out_data_ptr + 4)
var response_uint8_array = new Uint8Array(Module.HEAPU8.buffer, out_data_array[0], out_data_array[1])
Module._free(out_data_ptr)
The theory of operations here is that we create a two element array that will store the 'slots' to be filled in by calling get_result_buffer, and then construct a view over that exposing it as two number compatible elements. We then pass those in to our get_result_buffer function as lifted above with cwrap. After that, the heap memory that the context refers to is reachable from those slots, which can then be used to construct a Uint8Array that provides JS level access to the bytes in the result.
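For anyone who wants to see the slot mechanics without a compiled module, here is a sketch that simulates the heap with a plain ArrayBuffer. fakeGetResultBuffer is a made-up stand-in for the compiled get_result_buffer, and the addresses 16 and 64 are arbitrary. One deliberate difference from the snippet above: the final .slice() makes a real copy, so the bytes stay valid even if the context later reuses or frees its buffer.

```javascript
// Simulated heap; in Emscripten this role is played by Module.HEAPU8.buffer.
const heap = new ArrayBuffer(256);

// Hypothetical stand-in for the compiled get_result_buffer: it stores a pointer
// at `resultPtrSlot` and a length at `sizeSlot`.
function fakeGetResultBuffer(resultPtrSlot, sizeSlot) {
  const dataPtr = 64; // pretend the context owns 3 bytes at this address
  new Uint8Array(heap, dataPtr, 3).set([10, 20, 30]);
  new Uint32Array(heap, resultPtrSlot, 1)[0] = dataPtr;
  new Uint32Array(heap, sizeSlot, 1)[0] = 3;
}

const outPtr = 16; // would come from Module._malloc(8)
fakeGetResultBuffer(outPtr, outPtr + 4);

// Read the two slots only after the call has filled them in.
const slots = new Uint32Array(heap, outPtr, 2);
const resultPtr = slots[0];
const resultSize = slots[1];

// slice() copies the bytes out of the heap instead of aliasing them.
const result = new Uint8Array(heap, resultPtr, resultSize).slice();
console.log(Array.from(result)); // [ 10, 20, 30 ]
```

With ALLOW_MEMORY_GROWTH enabled, views created before a call can become detached if the heap grows during it, which is another reason to build the views after the call returns.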

Web assembly : sending data from js to argv argument of C++ main function

I want to send an array of string to my "main" C++ function.
I have tried this:
this.module = buildModule(args).then((res) => {
  const main = res.cwrap('main', 'number', ['number', 'array'])
  // create a pointer using the 'Glue' method and the String value
  const ptr1 = res.allocate(wasmBuild.intArrayFromString('test'), 'i8', 0)
  main(2, [ptr1])
})
I receive the int (2) but the "argv" argument prints only garbage on the other side...
What am I doing wrong?
Also, has anyone been able to use the 'array' argument?
According to the doc:
The [parameter] types are (...) “array” (for
a JavaScript array or typed array that corresponds to a C array; for
typed arrays, it must be a Uint8Array or Int8Array).
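A possible explanation, as far as I understand the quoted doc: 'array' describes an array of bytes, whereas argv is a char**, i.e. an array of 32-bit pointers that itself lives in linear memory. So the pointer array has to be written into memory and main called with ['number', 'number']. The sketch below only simulates that layout with a plain ArrayBuffer; alloc is a made-up stand-in for Module._malloc, and buildArgv and readArg are hypothetical helper names. Emscripten's callMain may also be worth a look if you only need to run main with arguments.

```javascript
const heap = new ArrayBuffer(1024);
const bytes = new Uint8Array(heap);
const words = new Uint32Array(heap);
const encoder = new TextEncoder();

// Tiny bump allocator, a hypothetical stand-in for Module._malloc.
let top = 8;
function alloc(size) {
  const ptr = top;
  top += (size + 3) & ~3; // keep 4-byte alignment
  return ptr;
}

// Copy each string into the heap, then build the array of pointers (argv).
function buildArgv(args) {
  const pointers = args.map((arg) => {
    const encoded = encoder.encode(arg);
    const ptr = alloc(encoded.length + 1);
    bytes.set(encoded, ptr);
    bytes[ptr + encoded.length] = 0; // NUL terminator
    return ptr;
  });
  const argv = alloc((pointers.length + 1) * 4);
  pointers.forEach((ptr, i) => {
    words[argv / 4 + i] = ptr;
  });
  words[argv / 4 + pointers.length] = 0; // argv[argc] is NULL
  return { argc: args.length, argv };
}

// What the C side effectively does when it reads argv[i] (ASCII only here).
function readArg(argv, i) {
  let ptr = words[argv / 4 + i];
  let out = '';
  while (bytes[ptr] !== 0) out += String.fromCharCode(bytes[ptr++]);
  return out;
}

const { argc, argv } = buildArgv(['prog', 'test']);
console.log(argc, readArg(argv, 0), readArg(argv, 1)); // 2 prog test
```

In a real module the resulting argv number would be passed as the second argument of main, and the allocations freed afterwards.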

How to use C++ allocated array in Emscripten generated code?

I have a C++ code like this:
extern "C" {
void MyCoolFunction (int** values)
{
int howManyValuesNeeded = 5;
*values = new int[howManyValuesNeeded];
for (int i = 0; i < howManyValuesNeeded; i++) {
(*values)[i] = i;
}
}
}
From C++ it can be used like this:
int *values = NULL;
MyCoolFunction (&values);
// do something with the values
delete[] values;
Of course the real code is much more complicated, but the point is that the function allocates an int array inside, and it decides what the array size will be.
I translated this code with Emscripten, but I don't know how could I access the array allocated inside the function from javascript. (I already know how to use exported functions and pointer parameters with Emscripten generated code, but I don't know how to solve this problem.)
Any ideas?
In Emscripten, memory is stored as a giant array of integers, and pointers are just indexes into that array. Thus, you can pass pointers back and forth between C++ and Javascript just like you do integers. (It sounds like you know how to pass values around, but if not, go here.)
Okay. Now if you create a pointer on the C++ side (as in your code above) and pass it over to Javascript, Emscripten comes with a handful of helper functions to allow you to access that memory. Specifically setValue and getValue.
Thus, if you passed your values variable into JS and you wanted to access index 5, you would be able to do so with something like:
var value5 = getValue(values+(5*4), 'i32');
Where you have to add the index times the number of bytes (5*4) to the pointer, and indicate the type (in this case 32 bit ints) of the array.
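Since the function takes an int**, there are two steps on the JS side: read the int* that the function stored in the slot you passed in, then read the ints at that address. Here is a sketch with a simulated heap; fakeMyCoolFunction is a made-up stand-in for the compiled function, and the addresses 16 and 64 are arbitrary. With Emscripten's helpers the same reads would be getValue(slot, 'i32') followed by getValue(arrayPtr + i * 4, 'i32').

```javascript
const heap = new ArrayBuffer(256);
const HEAP32 = new Int32Array(heap);
const HEAPU32 = new Uint32Array(heap);

// Hypothetical stand-in for the compiled MyCoolFunction(int** values).
function fakeMyCoolFunction(valuesSlot) {
  const allocated = 64; // pretend `new int[5]` returned this address
  for (let i = 0; i < 5; i++) HEAP32[allocated / 4 + i] = i;
  HEAPU32[valuesSlot / 4] = allocated; // *values = allocated array
}

const valuesSlot = 16; // would come from Module._malloc(4)
fakeMyCoolFunction(valuesSlot);

// First step: read the int* stored in the slot.
const arrayPtr = HEAPU32[valuesSlot / 4];
// Second step: copy the five ints starting at that address into a JS array.
const values = Array.from(new Int32Array(heap, arrayPtr, 5));
console.log(values); // [ 0, 1, 2, 3, 4 ]
```

The element count (5 here) is not stored anywhere JS can discover it, so the C++ side would need to return it or write it to a second out parameter.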
You can call the delete from JavaScript by wrapping it inside another exported function.
extern "C" { ...
void MyCoolFunction (int** values);
void finish_with_result(int*);
}
void finish_with_result(int *values) {
    delete[] values;
}
Alternatively you may also directly do this on JavaScript side: Module._free(Module.HEAPU32[values_offset/4]) (or something like that; code not tested).
