Bits flipped between my WebAssembly binary and my JavaScript code?

Rust WebAssembly code:
let size = (width * height) as usize;
let mut cells = FixedBitSet::with_capacity(size);
for i in 0..size {
    cells.set(i, i == 0);
}
JavaScript code, converting the bits from the memory buffer to a JavaScript 8-bit array:
const cells = new Uint8Array(memory.buffer, cellsPtr, (width * height) / 8);
When I print the FixedBitSet, I get the following:
100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
When I print the javascript array I get:
[ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ]
Meaning JavaScript is interpreting the first byte as 00000001.
Why is this happening?

I've not looked at the code of FixedBitSet, but I would assume that internally it's a buffer of integers and it just uses shifts to address each bit.
This means the simplest implementation would be that bit n is in element n / k (where k is the number of bits in whatever integer size it picked), then shifted by n % k.
For the sake of argument, let's assume k = 8 (i.e. FBS uses bytes internally). Bytes are traditionally printed in big-endian order, meaning the most significant bit first and the least significant bit last.
Given a bitset of size 32:
00000000 00000000 00000000 00000000
when you set bit 0, n / k = 0 and n % k = 0, so the first bit of the first byte is going to be... the right-most one:
00000001 00000000 00000000 00000000
which, interpreted as an array of u8, is [1, 0, 0, 0].
Now I don't know if that's what happens internally in FixedBitSet, but it seems like the most straightforward implementation.
And because WebAssembly is defined as a little-endian virtual machine, the exact same behaviour would occur even if FBS used wider integers: e.g. with u32, bit 0 would be the least significant bit (= last bit) of the least significant byte (= first byte).
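If you need to read individual cells on the JavaScript side, you can address bits the same way. A minimal sketch, assuming the least-significant-bit-first layout described above (the helper name is mine):
// Read bit n of the cells Uint8Array, assuming bit n lives in byte
// n / 8 at position n % 8, counting from the least significant bit.
function getBit(cells, n) {
  return (cells[n >> 3] >> (n & 7)) & 1;
}
// For the bitset above, getBit(cells, 0) === 1 and every other bit is 0.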


I have the encode function for an ASCII cipher, but need help with how to decode it

I am working on a puzzle which involves decoding a cipher. I have been given the function used to produce the ciphertext, but I am not sure how I can reverse engineer this to find the decode function.
This is the function, in JavaScript:
function encodeText(ascii, x, y, z) {
  for (let i = 0; i < ascii.length; i++) {
    if (i % 3 == 0) {
      ascii[i] = (ascii[i] + x) % 256;
    }
    if (i % 3 == 1) {
      ascii[i] = (ascii[i] + y) % 256;
    }
    if (i % 3 == 2) {
      ascii[i] = (ascii[i] + z) % 256;
    }
  }
  return ascii;
}
I am aware that it returns an array of numbers, which is the cipher text I have been presented with.
Help is greatly appreciated!
This is a Vigenère Cipher with a key of length three (x, y, z).
ascii[i] and x are both smaller than 256, but (ascii[i] + x) might result in a number larger than 255. That is why the modulo operator wraps it back into the range 0-255.
The decryption would look like (ascii[i] - x + 256) % 256.
ascii[i] - x is the direct opposite of ascii[i] + x, so I assume this is clear. However, ascii[i] may be a number smaller than x, in which case ascii[i] - x would be less than zero. The problem is that -5 % 256 is -5 and not 251, for example. So we need to move all possible results into the positive range before applying the modulo operation.
For example, (74 - 80 + 256) is 250 and 250 % 256 is still 250. Furthermore, (84 - 80 + 256) is 260 and 260 % 256 is 4 which is the difference.
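Putting that together, a minimal sketch of the decoder, mirroring the structure of encodeText above:
function decodeText(ascii, x, y, z) {
  const key = [x, y, z];
  for (let i = 0; i < ascii.length; i++) {
    // Subtract the key byte, then add 256 before the modulo so the
    // result stays in the range 0-255 even when ascii[i] < key byte.
    ascii[i] = (ascii[i] - key[i % 3] + 256) % 256;
  }
  return ascii;
}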
A Caesar Cipher and a Vigenère Cipher are usually broken by frequency analysis. Your instance of a Vigenère Cipher consists of three Caesar Ciphers. The key space is 256 * 256 * 256 = 16,777,216, which is far too many candidate keys to try while reading every possible recovered plaintext by hand.
You should know what language the plaintext is in. If you do, then you can split your ciphertext into three strings. The 1st, 4th, 7th and so on characters go into the first string. The 2nd, 5th, 8th and so on characters go into the second string. The 3rd, 6th, 9th and so on characters go into the third string. Now you can run a frequency analysis on each of those three resulting strings and compare the results to the letter frequency of your target language (English example).
If the ciphertext is sufficiently long, you should be able to determine which encrypted letters correspond to e and n. Once you have that, the shift is linear. Let's say that the letter you identified as e has an ordinal value of 43. If you look into an ASCII table, the actual lowercase e has an ordinal value of 101. That means you have to calculate 43 - 101 = -58. Since this has to be positive, we can add 256 to get x = 198. You should repeat that process for y (second string) and z (third string), too.
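Here is a sketch of that splitting and counting step (the helper names are mine, not part of the puzzle):
// Split the ciphertext bytes into three streams, one per key byte,
// and count how often each byte value occurs in a stream.
function splitStreams(cipher) {
  var streams = [[], [], []];
  for (var i = 0; i < cipher.length; i++) {
    streams[i % 3].push(cipher[i]);
  }
  return streams;
}
function byteFrequencies(stream) {
  var counts = new Array(256).fill(0);
  for (var i = 0; i < stream.length; i++) {
    counts[stream[i]]++;
  }
  return counts;
}
// If the most frequent byte in stream 0 is 43 and we guess it
// encrypts lowercase e (101), then x = (43 - 101 + 256) % 256 = 198.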

Modifying image pixels using bitwise operators (JSFeat)

I am using the JSFeat Computer Vision Library and am trying to convert an image to greyscale. The function jsfeat.imgproc.grayscale outputs to a matrix (img_u8 below), where each element is an integer between 0 and 255. I was unsure how to apply this matrix to the original image, so I went looking through their example at https://inspirit.github.io/jsfeat/sample_grayscale.htm.
Below is my code to convert an image to greyscale. I adopted their method of updating the pixels in the original image, but I do not understand how it works.
/**
 * I understand this stuff
 */
let canvas = document.getElementById('canvas');
let ctx = canvas.getContext('2d');
let img = document.getElementById('img-in');
ctx.drawImage(img, 0, 0, img.width, img.height);
let imageData = ctx.getImageData(0, 0, img.width, img.height);
let img_u8 = new jsfeat.matrix_t(img.width, img.height, jsfeat.U8C1_t);
jsfeat.imgproc.grayscale(imageData.data, img.width, img.height, img_u8);
let data_u32 = new Uint32Array(imageData.data.buffer);
let i = img_u8.cols * img_u8.rows, pix = 0;
/**
 * Their logic to update the pixel values of the original image
 * I need help understanding how the following works
 */
let alpha = (0xff << 24);
while (--i >= 0) {
  pix = img_u8.data[i];
  data_u32[i] = alpha | (pix << 16) | (pix << 8) | pix;
}
/**
 * I understand this stuff
 */
ctx.putImageData(imageData, 0, 0);
Thanks in advance!
It's a wide topic, but I'll try to roughly cover the basics needed to understand what goes on here.
As we know, the code is using 32-bit integer values, which means you can operate on four bytes simultaneously using fewer CPU instructions, and therefore in many cases increase overall performance.
Crash course
A 32-bit value is often notated as hex like this:
0x00000000
and represents the equivalent of bits starting with the least significant bit 0 on the right to the most significant bit 31 on the left. A bit can of course only be either on/set/1 or off/unset/0. 4 bits is a nibble, 2 nibbles are one byte. The hex value has each nibble as one digit, so here you have 8 nibbles = 4 bytes or 32 bits. As in decimal notation, leading 0s have no effect on the value, i.e. 0xff is the same as 0x000000ff (The 0x prefix also has no effect on the value; it is just the traditional C notation for hexadecimal numbers which was then taken over by most other common languages).
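For instance, you can check this in a JavaScript console (the 0b form requires ES6):
console.log(0xff === 0b11111111); // true: same value, two notations
console.log(0xff === 0x000000ff); // true: leading zeros don't change the value
console.log((0xff).toString(2));  // "11111111"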
Operators
You can bit-shift and perform logic operations such as AND, OR, NOT and XOR on these values directly (in assembly language you would fetch the value from a pointer/address and load it into a register, then perform these operations on that register).
So what happens is this:
The << means bit-shift to the left. In this case the value is:
0xff
or in binary (bits) representation (a nibble 0xf = 1111):
0b11111111
This is the same as:
0x000000ff
or in binary (we could not historically denote bit representation natively in JavaScript, though ES6 added the 0b prefix):
0b00000000 00000000 00000000 11111111
and is then bit-shifted to the left 24 bit positions, making the new value:
0b00000000 00000000 00000000 11111111
<< 24 bit positions =
0b11111111 00000000 00000000 00000000
or
0xff000000
So why is this necessary here? Well, that's an excellent question!
The 32-bit value in relation to canvas represents RGBA, and each of the components can have a value between 0 and 255, or in hex a value between 0x00 and 0xff. However, since most consumer CPUs today use little-endian byte order, the color components are stored at the memory level as ABGR rather than RGBA for 32-bit values.
We are normally abstracted away from this in a high-level language such as JavaScript of course, but since we now work directly with memory bytes through typed arrays, we have to consider this aspect as well, in relation to register width (here 32 bits).
So here we try to set the alpha channel to 255 (fully opaque) and then shift it 24 bits so it ends up in the correct position:
0xff000000
0xAABBGGRR
(Though this is an unnecessary step here, as they could just as well have set it directly as 0xff000000, which would be faster, but anyhoo.)
Next we use the OR (|) operator combined with bit-shift. We shift first to get the value in the correct bit position, then OR it onto the existing value.
OR will set a bit if either the existing or the new bit is set, otherwise it will remain 0. For example, starting with an existing value, now holding the alpha channel value:
0xff000000
We then want the blue component combined in, say with the value 0xcc (204 in decimal), which in 32 bits is currently represented as:
0x000000cc
so we need to first shift it 16 bits to the left in this case:
0x000000cc
<< 16 bits
0x00cc0000
When we now OR that value with the existing alpha value we get:
0xff000000
OR 0x00cc0000
= 0xffcc0000
Since the destination bits are all 0, only the value from the source (0xcc) is set, which is what we want (we can use AND to remove unwanted bits, but that's for another day).
And so on for the green and red components (the order in which they are OR'ed doesn't matter much).
So, with say pix = 0xcc, this line:
data_u32[i] = alpha | (pix << 16) | (pix << 8) | pix;
which translates into:
alpha = 0xff000000 Alpha
pix = 0x000000cc Red
pix << 8 = 0x0000cc00 Green
pix << 16 = 0x00cc0000 Blue
and OR'ed together would become:
value = 0xffcccccc
and we have a grey value, since all components have the same value. We have the correct byte order and can write it back to the Uint32 buffer using a single operation (in JS anyway).
You can optimize this line, though, by using a hard-coded value for alpha instead of a reference, now that we know what it does (if the alpha channel varies then of course you would need to read the alpha component value the same way as the other values):
data_u32[i] = 0xff000000 | (pix << 16) | (pix << 8) | pix;
Working with integers, bits and bit operators is, as said, a wide topic, and this just scratches the surface, but hopefully it's enough to make it clearer what goes on in this particular case.
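To round it off, here is a small sketch of the reverse direction: pulling the components back out of a packed pixel with shifts and the AND masking mentioned above, assuming the same little-endian ABGR layout:
const pixel = 0xffcccccc;        // A = 0xff, B = 0xcc, G = 0xcc, R = 0xcc
const r = pixel & 0xff;          // bits 0-7
const g = (pixel >> 8) & 0xff;   // bits 8-15
const b = (pixel >> 16) & 0xff;  // bits 16-23
const a = pixel >>> 24;          // bits 24-31 (unsigned shift keeps it non-negative)
console.log(r, g, b, a);         // 204 204 204 255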

JavaScript Typed Arrays - Different Views - 2

I posted a couple of questions on this a few days ago and got some excellent replies (JavaScript Typed Arrays - Different Views).
My second question involved two views of a buffer, an 8-bit array and a 32-bit array. By placing 0, 1, 2, 3 in the 8-bit array, I got 50462976 in the 32-bit array. As mentioned, the reason for the 32-bit value was well explained.
I can achieve the same thing with the following code:
var buf = new ArrayBuffer(4);
var arr8 = new Int8Array(buf);
var arr32 = new Int32Array(buf);
for (var x = 0; x < buf.byteLength; x++) {
  arr8[x] =
    (x << 24) |
    (x << 16) |
    (x << 8) |
    x;
}
console.log(arr8); // [0, 1, 2, 3]
console.log(arr32); // [50462976]
I can't find anything that explains the mechanics of this process. It seems to be saying that each arr8 element equals X bit-shifted 24 positions OR bit-shifted 16 positions OR bit-shifted 8 positions OR not bit-shifted.
That doesn't really make sense to me. I'd appreciate it if someone could shed some light on this.
Thanks,
Basically, your buffer is like this:
00000000 00000001 00000010 00000011
When handled as an Int8Array, it reads each 8-bit group individually: 0, 1, 2, 3
When handled as an Int32Array, it reads 32-bit groups (ie. 4 8-bit groups) to get 50462976
The memory used by the buffer is interpreted as 8-bit bytes for the Int8Array and 32-bit words for the Int32Array. The ordering of the bytes in the 8-bit array is the same as the ordering of the bytes in the single 32-bit word in the other array because they're the same bytes. There are no "mechanics" involved; it's just two ways of looking at the same 4 bytes of memory.
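On a little-endian machine those four bytes assemble, lowest address into the least significant byte, as 0x03020100; a quick sanity check:
console.log((3 << 24) | (2 << 16) | (1 << 8) | 0); // 50462976
console.log(0x03020100);                           // 50462976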
You get the exact same effect in C if you allocate a four-byte array and then create an int pointer to the same location.
Furthermore, this expression here:
arr8[x] =
  (x << 24) |
  (x << 16) |
  (x << 8) |
  x;
will do precisely the same thing as
arr8[x] = x;
You're shifting the value of x up into ranges that will be truncated away when the value is actually saved into the (8-bit) array element.
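A quick demonstration of that truncation (the values are just for illustration):
var demo = new Int8Array(1);
demo[0] = (3 << 24) | (3 << 16) | (3 << 8) | 3; // 0x03030303
console.log(demo[0]); // 3 — only the low 8 bits survive the store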

What does the symbol >> mean in JavaScript?

What does the >> symbol mean? On this page, there's a line that looks like this:
var i = 0, l = this.length >> 0, curr;
It's bitwise shifting.
Let's take the number 7, which in binary is 0b00000111
7 << 1 shifts it one bit to the left, giving you 0b00001110, which is 14
Similarly, you can shift to the right: 7 >> 1 will cut off the last bit, giving you 0b00000011 which is 3.
[Edit]
In JavaScript, numbers are stored as floats. However, when shifting you need integer values, so using a bit shift on a JavaScript value will convert it from float to integer.
In JavaScript, shifting by 0 bits will round the number down* (integer rounding); better phrased, it will convert the value to a 32-bit integer:
> a = 7.5;
7.5
> a >> 0
7
*: Unless the number is negative.
Sidenote: since JavaScript's integers are 32-bit, avoid using bitwise shifts unless you're absolutely sure that you're not going to use large numbers.
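A few more examples of the shift operators in action, which you can try in a console:
console.log(7 << 1);    // 14
console.log(7 >> 1);    // 3
console.log(7.5 >> 0);  // 7  (truncated toward zero)
console.log(-7.5 >> 0); // -7 (also toward zero, hence "rounded up" for negatives)
console.log(-8 >> 1);   // -4 (>> keeps the sign bit)
console.log(-8 >>> 1);  // 2147483644 (>>> shifts in zeros instead)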
[Edit 2]
this.length >> 0 also forces this.length into a plain 32-bit integer, which matters when this is an array-like object whose length property might not be a whole number.
Just like in many other languages, the >> operator (along with << and >>>) is a bitwise shift.

Convert a string with a hex representation of an IEEE-754 double into JavaScript numeric variable

Suppose I have the hex string "4072508200000000" and I want the floating point number it represents in IEEE-754 double format (293.03173828125000) to be put into a JavaScript variable.
I can think of a way that uses some masking and a call to pow(), but is there a simpler solution?
A client-side solution is needed.
This may help: it's a website that lets you enter the hex encoding of an IEEE-754 double and get an analysis of its mantissa and exponent.
http://babbage.cs.qc.edu/IEEE-754/64bit.html
Because people always tend to ask "why?", here's why: I'm trying to fill out an existing but incomplete implementation of Google's Protocol Buffers (protobuf).
I don't know of a good way. It certainly can be done the hard way; here is a single-precision example done entirely within JavaScript:
js> a = 0x41973333
1100428083
js> (a & 0x7fffff | 0x800000) * 1.0 / Math.pow(2,23) * Math.pow(2, ((a>>23 & 0xff) - 127))
18.899999618530273
A production implementation should consider that most of the fields have magic values, typically implemented by specifying a special interpretation for what would have been the largest or smallest exponent. So, detect NaNs and infinities. The above example should also be checking for negatives: (a & 0x80000000).
Update: OK, I've got it for doubles, too. You can't directly extend the above technique, because the internal JS representation is a double, so by definition it can handle at best a bit string of length 52, and it can't shift by more than 32 at all.
OK, to do a double you first chop off, as a string, the low 8 hex digits or 32 bits; process them with a separate object. Then:
js> a = 0x40725082
1081233538
js> (a & 0xfffff | 0x100000) * 1.0 / Math.pow(2, 52 - 32) * Math.pow(2, ((a >> 52 - 32 & 0x7ff) - 1023))
293.03173828125
js>
I kept the above example because it's from the OP. A harder case is when the low 32 bits have a value. Here is the conversion of 0x40725082deadbeef, a full-precision double:
js> a = 0x40725082
1081233538
js> b = 0xdeadbeef
3735928559
js> e = (a >> 52 - 32 & 0x7ff) - 1023
8
js> (a & 0xfffff | 0x100000) * 1.0 / Math.pow(2,52-32) * Math.pow(2, e) +
b * 1.0 / Math.pow(2, 52) * Math.pow(2, e)
293.0319506442019
js>
There are some obvious subexpressions you can factor out but I've left it this way so you can see how it relates to the format.
A quick addition to DigitalRoss's solution, for those finding this page via Google as I did.
Apart from the edge cases for +/- Infinity and NaN, which I'd love input on, you also need to take into account the sign of the result:
s = a >> 31 ? -1 : 1
You can then include s in the final multiplication to get the correct result.
I think for a little-endian source you'll also need to reverse the byte order in a and b and swap them.
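Putting the pieces together, a sketch of the whole decode based on DigitalRoss's expressions plus the sign (the helper name is mine; NaN, infinities and denormals are ignored):
function decodeDoubleWords(hi, lo) {
  var sign = hi >> 31 ? -1 : 1;                // bit 31 of the high word
  var e = ((hi >> 20) & 0x7ff) - 1023;         // 11-bit biased exponent
  var frac = (hi & 0xfffff | 0x100000) / Math.pow(2, 20) +
             (lo >>> 0) / Math.pow(2, 52);     // >>> 0 keeps lo unsigned
  return sign * frac * Math.pow(2, e);
}
console.log(decodeDoubleWords(0x40725082, 0xdeadbeef)); // 293.0319506442019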
The new Typed Arrays mechanism allows you to do this (and is probably an ideal mechanism for implementing protocol buffers):
var buffer = new ArrayBuffer(8);
var bytes = new Uint8Array(buffer);
var doubles = new Float64Array(buffer); // not supported in Chrome
bytes[7] = 0x40; // Load the hex string "40 72 50 82 00 00 00 00"
bytes[6] = 0x72;
bytes[5] = 0x50;
bytes[4] = 0x82;
bytes[3] = 0x00;
bytes[2] = 0x00;
bytes[1] = 0x00;
bytes[0] = 0x00;
my_double = doubles[0];
document.write(my_double); // 293.03173828125
This assumes a little-endian machine.
Unfortunately Chrome does not have Float64Array, although it does have Float32Array. The above example does work in Firefox 4.0.1.
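On engines that support it, a DataView avoids the endianness assumption entirely, because the byte order is stated explicitly. A minimal sketch (the helper name is mine):
function hexToDouble(hex) {
  var view = new DataView(new ArrayBuffer(8));
  for (var i = 0; i < 8; i++) {
    // Write the bytes in the order they appear in the hex string.
    view.setUint8(i, parseInt(hex.slice(i * 2, i * 2 + 2), 16));
  }
  return view.getFloat64(0, false); // false = read as big-endian
}
console.log(hexToDouble("4072508200000000")); // 293.03173828125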
