How to produce an `ArrayBuffer` from `bytes` using `js_of_ocaml` - javascript

I am building a JavaScript library that is implemented in OCaml and compiled to JavaScript using js_of_ocaml.
One of my OCaml functions returns a string containing binary data. How can I expose that as an ArrayBuffer using js_of_ocaml?

When you compile to JavaScript, manipulating binary data in strings is extremely error-prone!
The underlying reason is a questionable design choice in js_of_ocaml:
JavaScript strings are encoded in UTF-16 whereas OCaml strings are (implicitly) encoded in UTF-8, so js_of_ocaml tries to translate between the two. As a result, when it encounters a "character" whose code is > 127, js_of_ocaml converts it, which is a disaster if it is in fact raw binary data!
The solution is to manipulate bigstrings instead of strings.
In plain OCaml, bigstrings have the type (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t, but more and more libraries define an alias for it.
In particular, they are Typed_array.Bigstring.t in js_of_ocaml (whose module provides functions to convert to and from ArrayBuffer).
If your function happens to work on strings once compiled to JavaScript, conversion functions between bigstrings and strings are available in several places.
One example is the bigstring library: http://c-cube.github.io/ocaml-bigstring/ and these functions are also available in Lwt_bytes from lwt.
There is another question on the same subject (including ways to handle OCaml strings in JavaScript without touching them at all, using gen_js_api) at
https://discuss.ocaml.org/t/handling-binary-data-in-ocaml-and-javascript/1519
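The hazard described above can be illustrated on the JavaScript side. This is a minimal sketch, assuming the binary data arrives as a JavaScript string in which every character code is one byte (0 to 255); the helper name byteStringToArrayBuffer is hypothetical and is not part of js_of_ocaml.

```javascript
// Hypothetical helper (not a js_of_ocaml API): copy a "byte string",
// where each UTF-16 code unit holds one byte, into a fresh ArrayBuffer.
function byteStringToArrayBuffer(s) {
  const bytes = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code > 0xff) {
      throw new RangeError("not a byte string: code unit " + code + " at index " + i);
    }
    bytes[i] = code;
  }
  return bytes.buffer;
}

const raw = "\x00\xe9\xff";

// Byte-for-byte copy: the buffer has exactly as many bytes as the string has characters.
const buf = byteStringToArrayBuffer(raw);

// For contrast: re-encoding the same string as UTF-8 expands every
// character above 127 into two bytes, which corrupts raw binary data.
const utf8 = new TextEncoder().encode(raw);
```

Comparing buf.byteLength with utf8.length on data containing bytes above 127 shows the kind of silent change that an encoding conversion introduces.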

Related

CryptoJS.enc.Base64.parse vs Base64.decodeBase64, what's the difference?

How do these two differ, or are they the same?
var key2 = CryptoJS.enc.Base64.parse(apiKey);
and
byte[] decodedBase64APIKeyByteArray = Base64.decodeBase64(apiKey);
I have gone through the APIs of both, and both appear to perform a conversion. Would the conversion be the same for the same input?
Will the output of both be the same?
Both decode normal base64 with the default base64 alphabet including possible padding characters at the end.
There are a few differences however.
Documentation: The commons-codec decoder is at least somewhat documented.
The input: The commons-codec decoder accepts base64 and strips line endings and similar characters (required for e.g. MIME decoding). A quick look at the CryptoJS code shows that it requires base64 without whitespace. So the Java-based decoder accepts more forms of input.
The implementation: The CryptoJS parsing brings tears to my eyes, and not of joy. It has terrible performance, if only because of how it handles the base64 without streaming. It even uses an indexOf to look up possible padding characters up front, which is both woefully bad and non-performant. Apache's implementation is only slightly better. Both should only be used for relatively small amounts of data.
The output: CryptoJS returns a word array while commons-codec returns a byte array. For keys this doesn't matter much, as Java usually expects a byte array for SecretKeySpec while CryptoJS directly uses a word array as the key.
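To compare the two outputs byte for byte, the word array has to be flattened first. This is a minimal sketch, assuming the word array has the shape { words, sigBytes } with big-endian 32-bit words, as CryptoJS word arrays do; the helper name wordArrayToBytes is hypothetical, and a plain object stands in for the CryptoJS result so that the snippet runs without the library.

```javascript
// Hypothetical helper: flatten a CryptoJS-style word array
// ({ words: 32-bit big-endian integers, sigBytes: number of valid bytes })
// into a Uint8Array, comparable to the byte[] returned on the Java side.
function wordArrayToBytes(wordArray) {
  const { words, sigBytes } = wordArray;
  const bytes = new Uint8Array(sigBytes);
  for (let i = 0; i < sigBytes; i++) {
    // Pick byte (i % 4) of word (i / 4), most significant byte first.
    bytes[i] = (words[i >>> 2] >>> (24 - (i % 4) * 8)) & 0xff;
  }
  return bytes;
}

// "TWFu" is base64 for the three bytes of "Man" (0x4d, 0x61, 0x6e).
// As a word array this is one word with three significant bytes.
const decodedKey = wordArrayToBytes({ words: [0x4d616e00], sigBytes: 3 });
```

With the real library, the argument would be the value returned by CryptoJS.enc.Base64.parse(apiKey).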

Qt: what are the QML/C++ data types which can be converted to JSON?

According to http://doc.qt.io/qt-5/qtwebchannel-javascript.html
Furthermore keep in mind that only QML/C++ data types which can be converted to JSON will be (de-)serialized properly and thus accessible to HTML clients.
What are those data types which can be converted to JSON?
Are QJsonObject and QJsonDocument among them?
You can look into the documentation for classes such as QJsonValue and QJsonObject and see which types and classes can be used by constructors or by from*(...) functions, which are usually static and ask for a QVariant/QVariantHash/QVariantMap.
Given that in Qt JavaScript an array can be converted into a QList<> and an object to a QVariantMap, I would guess those (and basic types such as int, float, string...) should be passed to the C++ side and made into QJson(Value/Object/Array) then.
Depending on what you want, a QJsonObject could, for example, be formatted as a string. For further information, see the Qt documentation on JSON support.

using regexp on raw binary data

I'm embedding JavaScript in my C++ app (via V8) and I get some raw binary data which I want to pass to JavaScript. Now, in the JavaScript, I plan to do some regular expressions on the data.
When using just the standard JavaScript String object for my data, everything is quite straightforward. However, as far as I understand it, it uses a UTF-16 representation and expects the data to be valid Unicode. But I have arbitrary data (it might contain '\0' and other raw bytes, although it is just text for the most part).
How should I handle this? I searched around a bit, and maybe ArrayBuffer or something similar is the object I need to store my raw data. However, I didn't find out how to apply the usual regular expression methods to that object. (Basically I need RegExp.test and RegExp.exec.)
I just checked out the Node.js code and it seems as if they support binary data and just put it into a string via v8::String::NewFromOneByte. See here and here. So that would answer my question (i.e., I can just use String), wouldn't it? Any downsides?
(I still don't see why my question is bad. Please explain the downvote.)
From all my current tests, it seems like it works just as expected with normal String.
You can even specify that in JavaScript directly, e.g.
var s = "\x00\x01\x02\x03"
and regular expressions on that string work as expected.
On the C++ side, if you want to get your binary data into a JS String object:
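#include <cassert>
#include <v8.h>

// Wrap raw bytes in a one-byte (Latin-1) JS string: each byte becomes one character.
v8::Local<v8::String> jsBinary(const uint8_t* data, uint32_t len) {
    assert(int(len) >= 0);  // the length must fit in a signed int
    return v8::String::NewFromOneByte(v8::Isolate::GetCurrent(), data, v8::String::kNormalString, len);
}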
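On the JavaScript side, the behaviour described above can be checked with a short sketch. It assumes the data is a string in which each character holds one byte; the sample bytes are made up for illustration.

```javascript
// Made-up sample: raw bytes, including NUL and a byte above 127, around some text.
const data = "\x00\x01\x02\x03text\x00\xff";

// RegExp.test works on control characters and NUL bytes.
const hasPair = /\x01\x02/.test(data);

// RegExp.exec finds the text portion and reports its offset in the data.
const textMatch = /[a-z]+/.exec(data);

// Bytes above 127 can be matched with an escape range.
const highByte = /[\x80-\xff]/.exec(data);
```

Because each byte maps to exactly one character, match indices are also byte offsets into the original data.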

Why does JSON allow only strings as keys?

Why does JSON only allow a string to be the key of a pair? Why not other types such as null, number, bool, object, or array? Considering that JSON is tightly related to JavaScript, could I deduce the reason from the JavaScript specification (ECMA-262)? I'm a complete newcomer to JavaScript; could you help point it out?
The JSON format is deliberately based on a subset of JavaScript object literal syntax and array literal syntax, and JavaScript objects can only have strings as keys - thus JSON keys are strings too. (OK, you can sort of use numbers as JavaScript object keys, but really they get converted to strings.)
Note that the point of JSON is that it is a string representation of data to allow easy exchange between programs written in different languages running on different machines in different environments. If you wanted to use an object as a key then that object would in turn have to be somehow represented as a string for transmission, but then the receiving language would need to be able to use objects as keys and that would mean you'd need a limited subset of JSON for those languages which would just be a mess.
"Considering JSON is a part of JavaScript"
No, it isn't. Newer browsers provide methods for creating and parsing JSON, but they're not part of the language as such except that JSON is a string format and JavaScript can do strings. JSON is always a string representation - it has to be parsed to create an object for use within JavaScript (or other languages) and once that happens JavaScript (or the other languages) treat the resulting object the same as any other object.
(Note also that a particular bit of JSON doesn't necessarily have any keys at all: it could just be an array, like '["one","two","three"]'.)
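The conversion of JavaScript object keys to strings mentioned in this answer can be observed directly. A minimal sketch; the variable names are illustrative only.

```javascript
// Non-string values used as property keys are converted to strings.
const obj = {};
obj[1] = "number key";
obj[true] = "boolean key";
obj[null] = "null key";
obj[{}] = "object key"; // becomes the string "[object Object]"

const keys = Object.keys(obj);

// Serializing shows every key as a quoted string.
const serialized = JSON.stringify({ 1: "a", true: "b" });

// A JSON text does not need keys at all: it can be just an array.
const list = JSON.parse('["one","two","three"]');
```

Every entry in keys is a string, which matches the string-only keys of JSON.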
The main reason, according to the discoverer of the JSON representation, is that when parsing JSON data, a key you use to refer to a value might be a reserved word in the language doing the parsing.
See the talk by Douglas Crockford, who is the discoverer of the JSON representation.
Example:
{
    id: 1234,
    name: "foo",
    do: "something"
}
Since JSON is meant to be compatible across languages, this data set can be used in many of them. But the word do is a keyword in JavaScript, so with unquoted keys it could end up as a syntax error when parsed (older JavaScript engines rejected reserved words as unquoted property names).
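The difference between quoted and unquoted keys can be checked with JSON.parse. A minimal sketch; the helper name tryParse is hypothetical.

```javascript
// Hypothetical helper: report whether a text is accepted by JSON.parse.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

// Quoted keys: valid JSON, even though "do" is a JavaScript keyword.
const quoted = tryParse('{"id": 1234, "name": "foo", "do": "something"}');

// Unquoted keys: not valid JSON, so the parser rejects the text.
const unquoted = tryParse('{id: 1234, name: "foo", do: "something"}');
```

Requiring quotes means a JSON parser never has to know which words are reserved in any particular language.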
Because that is the way the specification was written.

Can JSON seamlessly be "eval"ed into Javascript?

JSON is said to be a subset of Javascript.
Is it then safe to assume that all JSON (assuming it's properly encoded) can be safely evaled into Javascript?
For instance, the ampersand & as a string gets JSON encoded into "\u0026".
Is it safe (both theoretically and in practice including old browsers) to assume that this, written in Javascript, will be for all intents equivalent to &?
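The specific ampersand case from the question can be checked with a short sketch. It only tests this one input in whatever engine runs it, so it says nothing about JSON in general or about old browsers.

```javascript
// The JSON text for the string "&", written with a Unicode escape.
// In this source file the backslash is doubled so the JSON text
// really contains the six characters \u0026 between the quotes.
const jsonText = '"\\u0026"';

// Parsed as JSON.
const parsedValue = JSON.parse(jsonText);

// Evaluated as JavaScript source (parentheses force expression context).
const evaledValue = eval("(" + jsonText + ")");
```

Both values are the one-character string "&", so for this input the JSON and JavaScript readings agree.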
