Node.JS Big-Endian UCS-2

I'm working with Node.JS. Node's buffers support little-endian UCS-2 but not big-endian, which is what I need. How can I work with big-endian UCS-2?

According to Wikipedia, UCS-2 should always be big-endian, so it's odd that Node only supports little-endian. You might consider filing a bug. That said, switching endianness is fairly straightforward, since it's just a matter of byte order: swap each pair of bytes in place to go back and forth between little- and big-endian, like so:
function swapBytes(buffer) {
  var l = buffer.length;
  if (l & 0x01) {
    throw new Error('Buffer length must be even');
  }
  for (var i = 0; i < l; i += 2) {
    var a = buffer[i];
    buffer[i] = buffer[i+1];
    buffer[i+1] = a;
  }
  return buffer;
}
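As a rough sketch of how the swap might be used end to end: the following assumes a reasonably recent Node.js, where `Buffer.from` and the built-in `buf.swap16()` (which performs the same pairwise swap in place) are available. `encodeUcs2BE` and `decodeUcs2BE` are hypothetical helper names, not part of Node.

```javascript
// Hypothetical helpers: big-endian UCS-2 built from Node's
// little-endian 'ucs2' encoding plus a 16-bit byte swap.
function encodeUcs2BE(str) {
  // Buffer.from(..., 'ucs2') yields little-endian code units;
  // swap16() swaps each byte pair in place and returns the buffer.
  return Buffer.from(str, 'ucs2').swap16();
}

function decodeUcs2BE(buf) {
  // Copy first so the caller's buffer is not modified by the in-place swap.
  return Buffer.from(buf).swap16().toString('ucs2');
}

console.log(encodeUcs2BE('A').toString('hex'));          // 0041
console.log(decodeUcs2BE(Buffer.from('0041', 'hex')));   // A
```

On a Node version without `swap16()`, the `swapBytes` function above could be substituted for it.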

Related

Floating-point value from bytes reading completely wrong - Node.js

I am attempting to read a 32-bit IEEE-754 float from a buffer, but it is not reading correctly at all. For example, [00 00 00 3f] becomes 8.828180325246348e-44 instead of 0.5. I have also noticed that negative floats never convert properly. For example, [00 00 20 c1] becomes 1.174988762336359e-41, not -10.0. What am I doing wrong? Is this some floating-point precision issue? This is my code:
function readFloat() {
const value = this.data.readFloatBE(this.offset);
this.offset += 4;
return value;
}
this.data being a Buffer, this.offset being the offset currently read in bytes.
One thing to note is that even with something like this in vanilla JavaScript, I get the same results:
function floatFromBytes(bytes) {
  const buf = new ArrayBuffer(4);
  const v = new DataView(buf);
  bytes.forEach((b, i) => {
    v.setUint8(i, b);
  });
  return v.getFloat32(0);
}
floatFromBytes([0x00, 0x00, 0x20, 0xc1]); // should be -10.0, but is 1.174988762336359e-41
EDIT: Resolved, turns out the bytes were reversed for some reason.
New code:
function readFloat() {
// This is a bit of a weird IEEE 754 float implementation, but it works
let buf = new ArrayBuffer(4);
let view = new DataView(buf);
let bytes = this.readBytes(4);
// reverse the bytes
for (let i = 0; i < 4; i++) {
view.setUint8(i, bytes[3 - i]);
}
return view.getFloat32(0);
}

As you noted, this is just an endianness issue. Different systems order the bytes of a multi-byte value differently. The most common ordering at the moment is little-endian: it is used by Intel x86-compatible processors, and ARM systems are commonly configured to use it as well.
Because JavaScript tries to be CPU-agnostic, it asks you to choose the byte order in which to interpret the data. The BE in Buffer.readFloatBE stands for big-endian; there is also an LE version, which is probably what you want to use.
For example:
Buffer.from('0000003f','hex').readFloatLE() // => 0.5
Buffer.from('000020c1','hex').readFloatLE() // => -10.0
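The same choice exists in the DataView version from the question: `getFloat32` takes an optional second argument that selects little-endian, and reads big-endian when it is omitted. A minimal sketch, with the `littleEndian` parameter added here for illustration:

```javascript
// Read a 32-bit float from 4 bytes; `littleEndian` selects the byte order.
function floatFromBytes(bytes, littleEndian) {
  const view = new DataView(new ArrayBuffer(4));
  bytes.forEach((b, i) => view.setUint8(i, b));
  // The second argument to getFloat32 is the littleEndian flag;
  // when it is omitted or false, DataView reads big-endian.
  return view.getFloat32(0, littleEndian);
}

console.log(floatFromBytes([0x00, 0x00, 0x00, 0x3f], true)); // 0.5
console.log(floatFromBytes([0x00, 0x00, 0x20, 0xc1], true)); // -10
```

This avoids reversing the bytes by hand before reading them.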

Struct operations in Javascript through Emscripten

I am having quite a lot of problems with emscripten inter-operating between C and Javascript.
More specifically, I am having trouble accessing a struct created in C in javascript, given that the pointer to the struct is passed into javascript as an external library.
Take a look at the following code:
C:
#include <stdlib.h>
#include <stdio.h>
#include <inttypes.h>
struct test_st;
extern void read_struct(struct test_st *mys, int siz);
struct test_st{
uint32_t my_number;
uint8_t my_char_array[32];
};
int main(){
struct test_st *teststr = malloc(sizeof(struct test_st));
teststr->my_number = 500;
for(int i = 0; i < 32; i++){
teststr->my_char_array[i] = 120 + i;
}
for(int i = 0; i < 32; i++){
printf("%d\n",teststr->my_char_array[i]);
}
read_struct(teststr, sizeof(*teststr)); /* size of the struct, not of the pointer */
return 0;
}
Javascript:
mergeInto(LibraryManager.library,
{
read_struct: function(mys,siz){
var read_ptr = 0;
console.log("my_number: " + getValue(mys + read_ptr, 'i32'));
read_ptr += 4;
for(var i = 0; i < 32; i++){
console.log("my_char[" + i + "]: " + getValue(mys + read_ptr, 'i8'));
read_ptr += 1;
};
},
});
This is then compiled using emcc cfile.c --js-library jsfile.js.
The issue here is that you can't really read structs in javascript, you have to get memory from the respective addresses according to the size of the struct field (so read 4 bytes from the uint32_t and 1 byte from the uint8_t). Ok, that wouldn't be an issue, except you also have to state the LLVM IR type for getValue to work, and it doesn't include unsigned types, so in the case of the array, it will get to 127 and overflow to -128, when the intended behaviour is to keep going up, since the variable is unsigned.
I looked everywhere for an answer but apparently this specific intended behaviour is not common. Changing the struct wouldn't be possible in the program I'm applying this to (not the sample one above).

One way is to use the HEAP* typed arrays exposed by Emscripten, which do have unsigned views:
mergeInto(LibraryManager.library, {
read_struct: function(myStructPointer, size) {
// Assumes the struct starts on a 4-byte boundary
var myNumber = HEAPU32[myStructPointer/4];
console.log(myNumber);
// Assumes my_char_array is immediately after my_number with no padding
var myCharArray = HEAPU8.subarray(myStructPointer+4, myStructPointer+4+32);
console.log(myCharArray);
}
});
This works in my test, running Emscripten 1.29.0-64bit, but as noted it makes assumptions about alignment and padding. In the cases I tested, a struct always started on a 4-byte boundary, and 32-bit unsigned integers inside a struct were also always aligned on a 4-byte boundary and so were accessible through HEAPU32.
However, it's beyond my knowledge whether you can depend on this behaviour in Emscripten. It's my understanding that you can't in the usual C/C++ world.
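The same idea can be tried outside Emscripten with a plain ArrayBuffer standing in for the heap. This is only a sketch: `heap`, `ptr` and `readTestStruct` are made-up names, the layout (a little-endian uint32 followed directly by 32 bytes) is the assumption stated in the code comments above, and a DataView is used instead of HEAPU32 so that the read does not depend on 4-byte alignment. It also shows that a value read as signed 8-bit can be turned back into its unsigned form with `& 0xFF`.

```javascript
// Stand-in for the Emscripten heap (assumption: in a real build the
// buffer behind HEAPU8 would play this role).
const heap = new ArrayBuffer(64);
const ptr = 6; // deliberately not 4-byte aligned

// Fill in a fake struct: uint32 my_number, then uint8 my_char_array[32].
const setup = new DataView(heap);
setup.setUint32(ptr, 500, true); // little-endian byte order assumed
for (let i = 0; i < 32; i++) {
  setup.setUint8(ptr + 4 + i, 120 + i);
}

// Hypothetical reader, assuming no padding between the two fields.
function readTestStruct(buffer, structPtr) {
  const view = new DataView(buffer);
  return {
    myNumber: view.getUint32(structPtr, true),
    // A view (not a copy) over the 32 array bytes.
    myCharArray: new Uint8Array(buffer, structPtr + 4, 32),
  };
}

const s = readTestStruct(heap, ptr);
console.log(s.myNumber);       // 500
console.log(s.myCharArray[8]); // 128

// Converting a byte that was read as signed back to unsigned:
const signed = new Int8Array(heap, ptr + 4, 32)[8]; // -128
console.log(signed & 0xFF);    // 128
```

The `& 0xFF` step is what would be applied to the result of `getValue(..., 'i8')` if switching to the HEAP* views is not an option.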

Retrieving binary data in Javascript (Ajax)

I'm trying to fetch a remote binary file and read its bytes, which (of course) are supposed to be in the range 0..255. Since the response is given as a string, I need to use charCodeAt to get the numeric value of every character. The problem I have come across is that charCodeAt returns the value in UTF-8 (if I'm not mistaken), so for example the ASCII value 139 gets converted to 8249. This messes up my whole application because I need to get those values as they are sent from the server.
The immediate solution is to create a big switch that, for every given UTF-8 code, returns the corresponding ASCII value. But I was wondering if there is a more elegant and simpler solution. Thanks in advance.

The following code has been extracted from an answer to this StackOverflow question and should help you work around your issue.
function stringToBytesFaster ( str ) {
var ch, st, re = [], j=0;
for (var i = 0; i < str.length; i++ ) {
ch = str.charCodeAt(i);
if(ch < 127)
{
re[j++] = ch & 0xFF;
}
else
{
st = []; // clear stack
do {
st.push( ch & 0xFF ); // push byte to stack
ch = ch >> 8; // shift value down by 1 byte
}
while ( ch );
// add stack contents to result
// done because chars have "wrong" endianness
st = st.reverse();
for(var k=0;k<st.length; ++k)
re[j++] = st[k];
}
}
// return an array of bytes
return re;
}
var str = "\x8b\x00\x01\x41A\u1242B\u4123C";
alert(stringToBytesFaster(str)); // 139,0,1,65,65,18,66,66,65,35,67

I would recommend encoding the binary data in some character-encoding-independent format such as Base64.
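Another option, which the XMLHttpRequest example further down this page also relies on, is to request the file with `overrideMimeType('...; charset=x-user-defined')`. With that charset the low byte of each character code is the original byte, so no lookup table is needed. A minimal sketch of the conversion step, with `binaryStringToBytes` as a hypothetical helper name:

```javascript
// Hypothetical helper: recover raw bytes from a string that was decoded
// with charset=x-user-defined, where bytes 0x80-0xFF arrive as
// U+F780-U+F7FF and bytes 0x00-0x7F arrive unchanged.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff; // keep only the low byte
  }
  return bytes;
}

console.log(Array.from(binaryStringToBytes('\uf78b\x00A'))); // [ 139, 0, 65 ]
```

Here the byte 139 from the question comes back as 139 rather than 8249.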

Decrypting images using JavaScript within browser

I have a web based application that requires images to be encrypted before they are sent to server, and decrypted after loaded into the browser from the server, when the correct key was given by a user.
[Edit: The goal is that the original image and the key never leaves the user's computer so that he/she is not required to trust the server.]
My first approach was to encrypt the image pixels using AES and leave the image headers untouched. I had to save the encrypted image in lossless format such as png. Lossy format such as jpg would alter the AES encrypted bits and make them impossible to be decrypted.
Now the encrypted images can be loaded into the browser, with the expected completely scrambled look. Here I have JavaScript code that reads the image data as RGB pixels using Image.canvas.getContext("2d").getImageData(), gets the key from the user, decrypts the pixels using AES, redraws the canvas and shows the decrypted image to the user.
This approach works but suffers from two major problems.
The first problem is that saving the completely scrambled image in lossless format takes a lot of bytes, close to 3 bytes per pixel.
The second problem is that decrypting large images in the browser takes a long time.
This leads to the second approach, which is to encrypt the image headers instead of the actual pixels. But I haven't found any way to read the image headers in JavaScript in order to decrypt them. The canvas gives only the already decompressed pixel data. In fact, the browser treats an image with an altered header as invalid.
Any suggestions for improving the first approach or making the second approach possible, or providing other approaches are much appreciated.
Sorry for the long post.

You inspired me to give this a try. I blogged about it and you can find a demo here.
I used Crypto-JS to encrypt and decrypt with AES and Rabbit.
First I get the CanvasPixelArray from the ImageData object.
var ctx = document.getElementById('leif')
.getContext('2d');
var imgd = ctx.getImageData(0,0,width,height);
var pixelArray = imgd.data;
The pixel array has four bytes for each pixel as RGBA but Crypto-JS encrypts a string, not an array. At first I used .join() and .split(",") to get from array to string and back. It was slow and the string got much longer than it had to be. Actually four times longer. To save even more space I decided to discard the alpha channel.
function canvasArrToString(a) {
  var s = "";
  // Skips every fourth byte (alpha) to save space.
  for (var i = 0; i < a.length; i += 4) {
    s += (String.fromCharCode(a[i])
      + String.fromCharCode(a[i+1])
      + String.fromCharCode(a[i+2]));
  }
  return s;
}
That string is what I then encrypt. I stuck with += after reading String Performance: an Analysis.
var encrypted = Crypto.Rabbit.encrypt(imageString, password);
I used a small 160x120-pixel image. With four bytes for each pixel that gives 76800 bytes. Even though I stripped the alpha channel, the encrypted image still takes up 124680 bytes, 1.62 times bigger. Using .join() it was 384736 bytes, 5 times bigger. One cause for it still being larger than the original image is that Crypto-JS returns a Base64-encoded string, and that adds something like 37%.
Before I could write it back to the canvas I had to convert it to an array again.
function canvasStringToArr(s) {
  var arr = [];
  for (var i = 0; i < s.length; i += 3) {
    for (var j = 0; j < 3; j++) {
      arr.push(s.charCodeAt(i + j));
    }
    arr.push(255); // Hardcodes alpha to 255.
  }
  return arr;
}
Decryption is simple.
var arr = canvasStringToArr(
  Crypto.Rabbit.decrypt(encryptedString, password));
imgd.data.set(arr); // imgd.data is read-only, so copy the values into it
ctx.putImageData(imgd, 0, 0);
Tested in Firefox, Google Chrome, WebKit3.1 (Android 2.2), iOS 4.1, and a very recent release of Opera.
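The packing and unpacking steps can be checked without a canvas or Crypto-JS. The following is a sketch of the same round trip on a hand-made pixel array; `pixelsToRgbString` and `rgbStringToPixels` are illustrative names, and a Uint8ClampedArray is assumed as the pixel container, as in current browsers.

```javascript
// Pack RGBA pixel bytes into a string of RGB characters (alpha dropped).
function pixelsToRgbString(pixels) {
  let s = '';
  for (let i = 0; i < pixels.length; i += 4) {
    s += String.fromCharCode(pixels[i], pixels[i + 1], pixels[i + 2]);
  }
  return s;
}

// Unpack an RGB string into RGBA bytes with alpha fixed at 255.
function rgbStringToPixels(s) {
  const out = new Uint8ClampedArray((s.length / 3) * 4);
  for (let i = 0, o = 0; i < s.length; i += 3, o += 4) {
    out[o] = s.charCodeAt(i);
    out[o + 1] = s.charCodeAt(i + 1);
    out[o + 2] = s.charCodeAt(i + 2);
    out[o + 3] = 255;
  }
  return out;
}

// Two pixels: opaque red and half-transparent blue.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 128]);
const packed = pixelsToRgbString(pixels);
console.log(packed.length);                          // 6
console.log(Array.from(rgbStringToPixels(packed)));  // [255, 0, 0, 255, 0, 0, 255, 255]
```

Note that the original alpha of the second pixel (128) is lost, which is the trade-off of discarding the alpha channel.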

Encrypt and Base64-encode the image's raw data when it is saved. (You can only do that in a web browser that supports the HTML5 File API, unless you use a Java applet.) When the image is downloaded, decode it, decrypt it, and create a data URI for the browser to use (or, again, use a Java applet to display the image).
You cannot, however, remove the need for the user to trust the server, because the server can send whatever JavaScript code it wants to the client, and that code can send a copy of the image to anyone once it is decrypted. This is a concern some have with the encrypted e-mail service Hushmail: that the government could force the company to deliver a malicious Java applet. This isn't an impossible scenario; telecommunications company Etisalat attempted to intercept BlackBerry communications by installing spyware onto the device remotely (http://news.bbc.co.uk/2/hi/technology/8161190.stm).
If your web site is one used by the public, you have no control over your users' software configurations, so their computers could even already be infected with spyware.

I wanted to do something similar: On the server is an encrypted gif and I want to download, decrypt, and display it in javascript. I was able to get it working and the file stored on the server is the same size as the original plus a few bytes (maybe up to 32 bytes). This is the code that performs AES encryption of the file calendar.gif and makes calendar.gif.enc, written in VB.Net.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim AES As New System.Security.Cryptography.RijndaelManaged
Dim encryption_key As String = "603deb1015ca71be2b73aef0857d77811f352c073b6108d72d9810a30914dff4"
AES.Key = HexStringToBytes(encryption_key)
Dim iv_string As String = "000102030405060708090A0B0C0D0E0F"
'System.IO.File.ReadAllBytes("calendar.gif")
'Dim test_string As String = "6bc1bee22e409f96e93d7e117393172a"
AES.Mode = Security.Cryptography.CipherMode.CBC
AES.IV = HexStringToBytes(iv_string)
Dim Encrypter As System.Security.Cryptography.ICryptoTransform = AES.CreateEncryptor
Dim b() As Byte = System.IO.File.ReadAllBytes("calendar.gif")
System.IO.File.WriteAllBytes("calendar.gif.enc", (Encrypter.TransformFinalBlock(System.IO.File.ReadAllBytes("calendar.gif"), 0, b.Length)))
End Sub
This is the javascript code that downloads calendar.gif.enc as binary, decrypts, and makes an image:
function wordArrayToBase64(wordArray) {
var words = wordArray.words;
var sigBytes = wordArray.sigBytes;
// Convert
var output = "";
var chr = [];
for(var i = 0; i < sigBytes; i++) {
chr.push((words[i >>> 2] >>> (24 - (i % 4) * 8)) & 0xff);
if(chr.length == 3) {
var enc = [
(chr[0] & 0xff) >> 2,
((chr[0] & 3) << 4) | ((chr[1] & 0xff) >> 4),
((chr[1] & 15) << 2) | ((chr[2] & 0xff) >> 6),
chr[2] & 63
];
for(var j = 0; j < 4; j++) {
output += Base64._keyStr.charAt(enc[j]);
}
chr = [];
}
}
if(chr.length == 1) {
chr.push(0,0);
var enc = [
(chr[0] & 0xff) >> 2,
((chr[0] & 3) << 4) | ((chr[1] & 0xff) >> 4),
((chr[1] & 15) << 2) | ((chr[2] & 0xff) >> 6),
chr[2] & 63
];
enc[2] = enc[3] = 64;
for(var j = 0; j < 4; j++) {
output += Base64._keyStr.charAt(enc[j]);
}
} else if(chr.length == 2) {
chr.push(0);
var enc = [
(chr[0] & 0xff) >> 2,
((chr[0] & 3) << 4) | ((chr[1] & 0xff) >> 4),
((chr[1] & 15) << 2) | ((chr[2] & 0xff) >> 6),
chr[2] & 63
];
enc[3] = 64;
for(var j = 0; j < 4; j++) {
output += Base64._keyStr.charAt(enc[j]);
}
}
return(output);
}
var xhr = new XMLHttpRequest();
// Relative URL assumed; adjust to wherever calendar.gif.enc is served from.
xhr.open('GET', 'calendar.gif.enc', true);
xhr.overrideMimeType('image/gif; charset=x-user-defined');
xhr.onreadystatechange = function() {
  if(xhr.readyState == 4) {
    var key = CryptoJS.enc.Hex.parse('603deb1015ca71be2b73aef0857d77811f352c073b6108d72d9810a30914dff4');
    var iv = CryptoJS.enc.Hex.parse('000102030405060708090A0B0C0D0E0F');
    var aesDecryptor = CryptoJS.algo.AES.createDecryptor(key, { iv: iv });
    // Pack the response string into big-endian 32-bit words for CryptoJS.
    var words = [];
    for(var i=0; i < (xhr.response.length+3)/4; i++) {
      var newWord = (xhr.response.charCodeAt(i*4+0)&0xff) << 24;
      newWord += (xhr.response.charCodeAt(i*4+1)&0xff) << 16;
      newWord += (xhr.response.charCodeAt(i*4+2)&0xff) << 8;
      newWord += (xhr.response.charCodeAt(i*4+3)&0xff) << 0;
      words.push(newWord);
    }
    var inputWordArray = CryptoJS.lib.WordArray.create(words, xhr.response.length);
    var plaintext0 = aesDecryptor.process(inputWordArray);
    var plaintext1 = aesDecryptor.finalize();
    // concat() modifies plaintext0 in place, so call it only once.
    var base64 = wordArrayToBase64(plaintext0.concat(plaintext1));
    $('body').append('<img src="data:image/gif;base64,' + base64 + '">');
    $('body').append('<p>' + base64 + '</p>');
  }
};
xhr.send();
Caveats:
I used a fixed IV and a fixed password. You should modify the code to generate a random IV during encryption and prepend it as the first bytes of the output file. The JavaScript needs to be modified, too, to extract these bytes.
The key length should be fixed: 256 bits for AES-256. If the password isn't 256 bits long, one possibility is to hash the password to 256 bits (for example with SHA-256) in both encryption and decryption.
You'll need crypto-js.
overrideMimeType might not work in older browsers. You need it so that the binary data is downloaded properly.
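The first caveat could be handled on the JavaScript side roughly as follows. This is a sketch under the assumption that the encryptor writes the 16-byte IV (one AES block) at the start of the file; `splitIvAndCiphertext` and `toHex` are hypothetical helpers, and the input is assumed to already be a Uint8Array of the downloaded bytes.

```javascript
const IV_LENGTH = 16; // AES block size in bytes

// Hypothetical helper: split a downloaded file into IV and ciphertext,
// assuming the encryptor wrote the IV as the first 16 bytes.
function splitIvAndCiphertext(bytes) {
  if (bytes.length < IV_LENGTH) {
    throw new Error('File too short to contain an IV');
  }
  return {
    iv: bytes.subarray(0, IV_LENGTH),
    ciphertext: bytes.subarray(IV_LENGTH),
  };
}

// Hex-encode bytes, e.g. for passing to a hex parser.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Fake 48-byte "file" containing the byte values 0..47.
const file = new Uint8Array(48).map((_, i) => i);
const parts = splitIvAndCiphertext(file);
console.log(toHex(parts.iv));         // 000102030405060708090a0b0c0d0e0f
console.log(parts.ciphertext.length); // 32
```

The hex string for the IV could then be handed to `CryptoJS.enc.Hex.parse`, the same call the code above uses for its fixed IV.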

Javascript converting text from greek to UTF-8

I am attempting to help my teacher convert a Greek textbook into an online application. Part of this includes taking a shapefile (which draws polygons on maps, along with descriptions of the polygons) and mapping everything on this map. I cannot directly access the part of the shapefile that has the data I need to convert, because it is in hexadecimal.
Anyway, here is the code I use to print to my console.
console.log(arr[1][i]['PERIOD']);
"arr" is the data array that contains all of the properties that I want to convert from Greek into UTF-8. I am only printing "PERIOD", rather than the 12 other properties that are associated with the array.
When I run my page, the console returns several variations of text (as there exist several periods). Here is an example of the text it returns.
ÎÏÏÎ±ÏÎºÎ®, ÎÎ»Î±ÏÎ¹ÎºÎ®, ÎÎ»Î»Î·Î½Î¹ÏÏÎ¹ÎºÎ®
Î¡ÏÎ¼Î±ÏÎºÎ®
ÎÎ¸ÏÎ¼Î±Î½Î¹ÎºÎ®
Î¥ÏÏÎµÏÎ¿Î²ÏÎ¶Î±Î½ÏÎ¹Î½Î®
Believe it or not, but this is not Greek text. So I snooped around and found this function to convert to utf-8:
function encode_utf8( s ){
return unescape(encodeURI( s ));
}
When I add this function to my console.log, this is what I get:
ÃÂ¡ÃÂÃÂ¼ÃÂ±ÃÂÃÂºÃÂ®
ÃÂÃÂ¸ÃÂÃÂ¼ÃÂ±ÃÂ½ÃÂ¹ÃÂºÃÂ®
ÃÂ¥ÃÂÃÂÃÂµÃÂÃÂ¿ÃÂ²ÃÂÃÂ¶ÃÂ±ÃÂ½ÃÂÃÂ¹ÃÂ½ÃÂ®
ÃÂÃÂ¸ÃÂÃÂ¼ÃÂ±ÃÂ½ÃÂ¹ÃÂºÃÂ®
I am not 100% positive but I think that the text I am trying to convert is currently in ISO-8859-7.
Any help with this would be amazing.
Thank you.

You can quite easily build a map from the bytes of one character set to another (although it can get tedious).
Assuming ISO 8859-7, which has only 256 code points and so is not too difficult:
function genCharMap() { // ISO 8859-7 to Unicode
var map = [], i, j, str;
map.length = 256;
map[0] = 0; // fill in 0
str = '\u2018\u2019\u00a3\u20ac\u20af\u00a6\u00a7\u00a8\u00a9\u037a\u00ab\u00ac\u00ad\u00ae\u2015\u00b0\u00b1\u00b2\u00b3\u0384\u0385\u0386\u00b7\u0388\u0389\u038a\u00bb\u038c\u00bd\u038e';
for (i = 0; i < str.length; ++i) // fill in 0xA1 to 0xBE
map[0xA1 + i] = str.charCodeAt(i);
for (i = 0; i < 256; ++i) // fill in blanks
if (i in map) j = map[i] - i;
else map[i] = j + i;
return map;
}
Now you can apply this transformation to your bytes
var byteArr = [0xC1, 0xE2, 0xE3, 0xE4], // Αβγδ
str_out = '',
i,
map = genCharMap();
for (i = 0; i < byteArr.length; ++i) {
str_out += String.fromCharCode(
map[byteArr[i]]
);
}
str_out; // "Αβγδ"
If you're re-writing this code for a charset with "combining chars" it may be safer to swap the str I used in genCharMap for an Array of numbers instead.
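A different possibility, offered only as an assumption about the data: the "Î" and "Ï" pairs in the question's sample look like what UTF-8 encoded Greek produces when each byte is read as a Latin-1 character. If that is what happened, the inverse of the question's `encode_utf8` may recover the text without any mapping table. `decode_utf8` is a name chosen here for illustration, and `escape` is a legacy global that is still available in browsers and Node.js. The sketch only applies when every character code in the input is below 256.

```javascript
// Reinterpret a string whose characters are really UTF-8 bytes
// (each char code in 0-255) as the text those bytes encode.
function decode_utf8(s) {
  // escape() writes each char code from 0x80 to 0xFF as %XX;
  // decodeURIComponent() then decodes the %XX sequences as UTF-8.
  return decodeURIComponent(escape(s));
}

// "Î±" is U+00CE U+00B1, i.e. the UTF-8 bytes CE B1 of Greek alpha.
console.log(decode_utf8('\u00ce\u00b1') === '\u03b1'); // true
```

If the input is not valid UTF-8 when read this way, `decodeURIComponent` throws a URIError, which is a quick way to tell that the assumption does not hold for a given string.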
