Node.js Uint8Array, Uint16Array, Uint32Array [duplicate] - javascript

This question already has answers here:
Map binary buffer to Int16Array or Int32Array
convert nodejs buffer to Uint16Array
Is Uint8Array / Uint16Array conversion broken in Firefox OS simulator?
I'm trying to figure out how to work with a buffer's contents as integers.
Simple code
var buffer = Buffer.alloc(4, 'a');
var interface16 = new Uint16Array(buffer);
var interface8 = new Uint8Array(buffer);
console.log(interface16);
console.log(interface8);
Why does the log output the same arrays (same values and same length)? After all, I'm asking for representations with 1 and 2 bytes per integer. The manuals say that Uint8Array takes 1 byte from the buffer, which I think of as a sequence of 0s and 1s, in this case 32 of them (4 * 8). So I can accept that it becomes an array with 4 elements.
But Uint16Array takes 2 bytes per integer, so the values should be different and the array should have 2 elements.
What am I wrong about, and which buffer should I pass to these constructors to actually see the difference?
I suspect that either the returned array always has the same length as the number of bytes, or it's a matter of how the console generates output. But I don't have enough knowledge to understand why.
Thank you for your attention.
P.S. If you can also recommend some approachable literature on this topic, that would be great.

In your example, both typed arrays end up having 4 elements, but the Uint8Array only uses 4 bytes to represent its contents, while the Uint16Array uses 8. You can see this by inspecting the byteLength property:
var buffer = Buffer.alloc(4, 'a');
var interface16 = new Uint16Array(buffer);
var interface8 = new Uint8Array(buffer);
console.log('Uint16 length: %d', interface16.length);
console.log('Uint16 byteLength: %d', interface16.byteLength);
console.log('Uint8 length: %d', interface8.length);
console.log('Uint8 byteLength: %d', interface8.byteLength);
Output
Uint16 length: 4
Uint16 byteLength: 8
Uint8 length: 4
Uint8 byteLength: 4
So the typed array constructors are creating typed arrays with the same number of elements as the source buffer, but the Uint16Array constructor is using two bytes instead of one for each element, filling the high-order byte of each element with zero.
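If you want a view that reinterprets those same 4 bytes as 16-bit elements (2 elements, 4 bytes), construct the typed array from the Buffer's underlying ArrayBuffer instead of from the Buffer itself. A minimal sketch (the byteOffset argument matters because Node may place small Buffers inside a larger shared pool):
var buffer = Buffer.alloc(4, 'a');
// Share the Buffer's memory instead of copying element by element
var view16 = new Uint16Array(buffer.buffer, buffer.byteOffset, buffer.length / 2);
console.log(view16.length);     // 2
console.log(view16.byteLength); // 4
console.log(view16[0]);         // 24929 (0x6161) on little-endian machines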

First you create a buffer with 4 'a's inside.
Then you call the Uint16Array constructor with that buffer.
The doc says: new Uint16Array(object);
When called with an object argument, a new typed array is created as if by the TypedArray.from() method.
See more: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray/from
It creates an array of 4 uint16 values, using one byte of the buffer per element.
a = 0x61 = 97
You then get four 97s.
Here is a Stack Overflow question with a successful buffer-to-Uint16Array conversion:
convert nodejs buffer to Uint16Array

Related

Extract a buffer having different types of data with it

Can a buffer hold both a string and an image? If so, how do I extract them separately?
An example case would be a buffer with image data and also file name data.
I have worked with SharedArrayBuffers/ArrayBuffers before.
If you are storing image pixel data, it's going to be a u32-int array, with four 8-bit segments controlling r, g, b and a respectively. Yes, you CAN tack string data onto the front in the form of a 'header' if you encode it to int values and decode it again, but I have a hard time understanding why that might be desirable, because working with raw pixel data that is ONLY pixel data is simpler. (I usually just stick it as a property of an object, together with whatever other data I want to store.)
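As a rough sketch of that layout (names are illustrative, and the channel order depends on how the image data was produced), a Uint32Array can hold one pixel per element while a Uint8Array view over the same memory exposes the individual channel bytes:
const pixels = new Uint32Array(16);             // 16 pixels, one 32-bit element each
const channels = new Uint8Array(pixels.buffer); // same memory, one byte per channel
channels[0] = 255; // first channel byte of pixel 0
channels[3] = 255; // fourth channel byte of pixel 0
console.log(pixels[0].toString(16)); // "ff0000ff" on little-endian machines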
Data Buffers
Typed arrays
You can use ArrayBuffer to create a buffer to hold the data. You then create a view of it using a typed array, e.g. unsigned bytes with Uint8Array. Types can be 8, 16, 32 or 64 bit (signed/unsigned integers) and float or double (32 or 64 bit floating point).
One buffer can have many views. You can read and write to a view just like any JS array. Values are automatically converted to the correct type when you write to a buffer, and converted to Number when you read from a view.
Example
Using buffer and views to read different data types
For example say you have file data that has a 4 character header, followed by a 16 bit unsigned integer chunk length, then 2 signed 16 bit integer coordinates, and more data
const fileBuffer = new ArrayBuffer(fileSizeInBytes);
// Create a view of the buffer so we can fill it with file data
const dataRaw = new Uint8Array(fileBuffer);
// load the data into dataRaw
// To get a string from the data we can create a util function
function readBufferString(buffer, start, length) {
  // create a view at the position of the string in the buffer
  const bytes = new Uint8Array(buffer, start, length);
  // read each byte, converting it to a JS unicode string
  var str = "", idx = 0;
  while (idx < length) { str += String.fromCharCode(bytes[idx++]) }
  return str;
}
// get 4 char chunk header at start of buffer
const header = readBufferString(fileBuffer, 0, 4);
if (header === "HEAD") {
  // Create views for 16 bit signed and unsigned integers
  const ints = new Int16Array(fileBuffer);
  const uints = new Uint16Array(fileBuffer);
  const chunkLength = uints[2]; // get the length as unsigned int16
  const x = ints[3];            // get the x coord as signed int16
  const y = ints[4];            // get the y coord as signed int16
}
A DataView
The above example is one way of extracting the different types of data from a single buffer. However, there could be a problem with older files and some data sources regarding the order of the bytes that make up multi-byte types (e.g. 32-bit integers). This is called endianness.
To help with using the correct endianness, and to simplify access to all the different data types in a buffer, you can use a DataView.
The data view lets you read from the buffer by type and endianness. For example, to read an unsigned 64-bit integer from a buffer:
// fileBuffer is a array buffer with the data
// Create a view
const dataView = new DataView(fileBuffer);
// read the 64 bit uint starting at the first byte in the buffer
// Note the returned value is a BigInt not a Number
const bInt = dataView.getBigUint64(0);
// If the int was in little endian order you would use
const bIntLE = dataView.getBigUint64(0, true); // true for little endian
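As a hypothetical sketch, the file layout from the earlier example (4-character header, 16-bit length, two signed 16-bit coordinates) could be read with a DataView, making the offsets and endianness explicit (assuming the file stores these values little endian):
const view = new DataView(fileBuffer);
const chunkLength = view.getUint16(4, true); // unsigned 16-bit length at byte offset 4
const x = view.getInt16(6, true);            // signed 16-bit x coordinate at byte offset 6
const y = view.getInt16(8, true);            // signed 16-bit y coordinate at byte offset 8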
Notes
Buffers are not dynamic. That means they can not grow and shrink and that you must know how large the buffer needs to be when you create it.
Buffers tend to be a little slower than JavaScript's standard arrays, as there is a lot of type coercion when reading from or writing to buffers.
Buffers can be transferred (zero-copy transfer) across threads, making them ideal for distributing large data structures between WebWorkers. There is also a SharedArrayBuffer that lets you create true parallel processing solutions in JS.
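For example, a zero-copy transfer to a Node worker thread might look like the following sketch (./worker.js is a hypothetical worker script):
const { Worker } = require('worker_threads');
const worker = new Worker('./worker.js');
const payload = new ArrayBuffer(1024 * 1024);
// The transfer list moves the buffer instead of copying it;
// after this call, payload is detached in the sending thread.
worker.postMessage(payload, [payload]);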

base64 or hex encoded Int8 binary string to Int32Array

I have a binary string that must travel over a ws websocket (I can't use websocket.io) and so is being JSON.stringified, e.g. msg.data = data.toString('base64')
On the other end, I want that data back not as byte wide binary, but as an array of 32 bit integers. e.g. if the binary data is [0, 0, 0, 1] going in, I want [1] coming out. Each output element is 4 bytes.
If I just take the binary string directly, I can new Int32Array(data) and I'm golden; the result is 1/4 the length of the original and each 32 bit element is made up from 4 of the original byte wide elements.
But when I've encoded it and then decoded it with var data = Buffer.from(msg.data, 'base64'), new Int32Array(data) is the same length as the original, and each 32 bit element is made from ONE of the original 8 bit elements. Int32Array.from(data) does the same.
I'm not finding any answer by searching, everyone appears to be ok with byte wide data.
.buffer
I was forgetting .buffer.
new Int32Array(data.buffer) works perfectly.
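A minimal round trip might look like this sketch (variable names are illustrative). The byteOffset and length arguments matter because small Buffers in Node can be views into a larger shared pool, so data.buffer may contain more than just your data:
const original = Buffer.from([1, 0, 0, 0, 2, 0, 0, 0]);
const wire = JSON.stringify({ data: original.toString('base64') });
const msg = JSON.parse(wire);
const data = Buffer.from(msg.data, 'base64');
const ints = new Int32Array(data.buffer, data.byteOffset, data.length / 4);
console.log(ints); // Int32Array [ 1, 2 ] on little-endian machines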

How do you decide which typed array to use?

I am trying to create a view of an ArrayBuffer object in order to JSONify it.
var data = { data: new Uint8Array(arrayBuffer) }
var json = JSON.stringify(data)
It seems that the size of the ArrayBuffer does not matter even with the smallest Uint8Array. I did not get any RangeError so far:) If so, how do I decide which typed array to use?
You decide based on the data stored in the buffer, or, better said, based on your interpretation of that data.
Also, a Uint8Array is not an 8-bit array; it's an array of unsigned 8-bit integers. It can have any length. A Uint8Array created from the same ArrayBuffer as a Uint16Array is going to be twice as long, because every byte in the ArrayBuffer becomes one element of the Uint8Array, while for the Uint16Array each pair of bytes becomes one element of the array.
A good explanation of what happens is if we try thinking in binary. Try running this:
var buffer = new ArrayBuffer(2);
var uint8View = new Uint8Array(buffer);
var uint16View = new Uint16Array(buffer);
uint8View[0] = 2;
uint8View[1] = 1;
console.log(uint8View[0].toString(2));
console.log(uint8View[1].toString(2));
console.log(uint16View[0].toString(2));
The output is going to be
10
1
100000010
because displayed as an unsigned 8 bit integer in binary, 2 is 00000010 and 1 is 00000001. (toString strips leading zeroes).
Uint8Array represents an array of bytes. As I said, an element is an unsigned 8 bit integer. We just pushed two bytes to it.
In memory those two bytes are stored side by side: 00000010 at offset 0 and 00000001 at offset 1 (binary form again used to make things clearer).
Now when you initialize a Uint16Array over the same buffer it's going to contain the same bytes, but because an element is an unsigned 16 bit integer (two bytes), accessing uint16View[0] takes the first two bytes and combines them. On a little-endian machine the second byte becomes the high-order byte, so you get 0000000100000010, which is 100000010 with no leading zeroes.
Interpreted as a base 10 (decimal) integer, 0000000100000010 is 258.
Neither Uint8Array nor Uint16Array store any data themselves. They are simply different ways of accessing bytes in an ArrayBuffer.
How does one choose which one to use? It's not based on preference but on the underlying data. ArrayBuffer is to be used when you receive some binary data from some external source (a web socket maybe) and already know what the data represents. It might be a list of unsigned 8 bit integers, or one of signed 16 bit ones, or even a mixed list where you know the first element is an 8 bit integer and the next one is a 16 bit one. Then you can use DataView to read typed items from it, as sketched below.
If you don't know what the data represents you can't choose what to use.
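For the mixed case mentioned above, a small sketch using DataView (the offsets and field meanings are made up for illustration):
// Suppose byte 0 is an unsigned 8-bit value and bytes 1-2 are a signed 16-bit value
const buffer = new ArrayBuffer(3);
const view = new DataView(buffer);
view.setUint8(0, 200);
view.setInt16(1, -12345, true);      // true = little endian
console.log(view.getUint8(0));       // 200
console.log(view.getInt16(1, true)); // -12345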

javascript hex codes with TCP/IP communication

I'm using the node module 'net' to create a client application that sends data through a TCP socket. The server-side application accepts the message only if it starts and ends with the correct hex codes; for example, the data packet would start with hex "0F" and end with hex "0F1C". How would I create these hex codes with JavaScript? I found some code to convert a UTF-8 string into hex (below), but I'm not sure this is what I need, as I don't have much experience with TCP/IP socket connections. Does anyone have experience with TCP/IP transfers and/or JavaScript hex codes?
function toHex(str, hex) {
  try {
    hex = unescape(encodeURIComponent(str))
      .split('').map(function (v) {
        return v.charCodeAt(0).toString(16)
      }).join('')
  } catch (e) {
    hex = str
    console.log('invalid text input: ' + str)
  }
  return hex
}
First of all, you do not need to convert your data string into hex values, in order to send it over TCP. Every string in node.js is converted to bytes when sent over the network.
Normally, you'd send over a string like so:
var data = "ABC";
socket.write(data); // will send bytes 65 66 67, or in hex: 41 42 43
Node.JS also allows you to pass Buffer objects to functions like .write().
So, probably the easiest way to achieve what you wish, is to create an appropriate buffer to hold your data.
var data = "ABC";
var prefix = 0x0F; // JavaScript allows hex numbers.
var suffix = 0x0F1C;
var dataSize = Buffer.byteLength(data);
// compute the required buffer length
var bufferSize = 1 + dataSize + 2;
var buffer = Buffer.alloc(bufferSize); // new Buffer(size) is deprecated
// store first byte on index 0;
buffer.writeUInt8(prefix, 0);
// store string starting at index 1;
buffer.write(data, 1, dataSize);
// stores last two bytes, in big endian format for TCP/IP.
buffer.writeUInt16BE(suffix, bufferSize - 2);
socket.write(buffer);
Explanation:
The prefix hex value 0F requires 1 byte of space. The suffix hex value 0F1C actually requires two bytes (a 16-bit integer).
When computing the number of bytes required for a string (JavaScript strings are UTF-16 encoded!), str.length is often not accurate, especially when the string contains non-ASCII characters. The proper way of getting the byte size of a string is to use Buffer.byteLength().
Buffers in node.js have static allocations, meaning you can't resize them after you create them. Hence, you'll need to compute the size of the buffer -in bytes- before creating it. Looking at our data, that is 1 (for our prefix) + Buffer.byteLength(data) (for our data) + 2 (for our suffix).
After that -imagine buffers as arrays of bytes (8-bit values)-, we'll populate the buffer, like so:
write the first byte (the prefix) using writeUInt8(byte, offset) with offset 0 in our buffer.
write the data string, using .write(string[, offset[, length]][, encoding]), starting at offset 1 in our buffer, and length dataSize.
write the last two bytes, using .writeUInt16BE(value, offset) with offset bufferSize - 2. We're using writeUInt16BE to write the 16-bit value in big-endian encoding, which is what you'd need for TCP/IP.
Once we've filled our buffer with the correct data, we can send it over the network, using socket.write(buffer);
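On the receiving side you could undo the framing in a similar way. A sketch that assumes the whole frame arrives in a single 'data' event, which real TCP code cannot rely on (you would normally accumulate chunks first):
socket.on('data', function (frame) {
  var prefix = frame.readUInt8(0);                   // expect 0x0F
  var suffix = frame.readUInt16BE(frame.length - 2); // expect 0x0F1C
  var payload = frame.toString('utf8', 1, frame.length - 2);
  // validate prefix and suffix, then use payload
});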
Additional tip:
If you really want to convert a large string to bytes, (e.g. to later print as hex), then Buffer is also great:
var buf = Buffer.from('a very large string');
// now you have a byte representation of the string.
Since bytes are all 0-255 decimal values, you can easily print them as hex values in console, like so:
for (let i = 0; i < buf.length; i++) {
const byte = buf[i];
const hexChar = byte.toString(16); // convert the decimal `byte` to hex string;
// do something with hexChar, e.g. console.log(hexChar);
}
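If you just want the whole buffer as a single hex string, Node can also do that directly:
console.log(buf.toString('hex')); // "61207665727920..." for 'a very large string'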

Buffer to integer. Having trouble understanding this line of code

I'm looking for help in understanding this line of code in the npm module hash-index.
The purpose of this module is to be a function which returns the SHA-1 hash of an input, modulo the second argument you pass.
The specific function in this module that I don't understand is this one that takes a Buffer as input and returns an integer:
var toNumber = function (buf) {
return buf.readUInt16BE(0) * 0xffffffff + buf.readUInt32BE(2)
}
I can't seem to figure out why those specific offsets of the buffer are chosen and what the purpose of multiplying by 0xffffffff is.
This module is really interesting to me and any help in understanding how it's converting buffers to integers would be greatly appreciated!
It builds a number from the first six bytes of the buffer.
First, it reads the first two bytes of the buffer as a big-endian unsigned 16-bit integer (UInt16) and multiplies that by 0xFFFFFFFF.
Then, it reads the next four bytes (UInt32) in the buffer, and adds them to the product - resulting in a number constructed from the first 6 bytes of the buffer.
Example: Consider [Buffer BB AA CC CC DD DD ... ]
0xbbaa * 0xffffffff = 0xbba9ffff4456
0xbba9ffff4456 + 0xccccdddd = 0xbbaacccc2233
And regarding the offsets, it reads that way:
the first read covers byte 0 to byte 1 (converts to type - UINT16),
the second read covers byte 2 to byte 5 (converts to type - UINT32).
So to sum it up, it constructs a number from the first 6 bytes of the buffer using big endian notation, and returns it to the calling function.
Hope that answers your question.
Wikipedia's Big Endian entry
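A quick check of the worked example above, as a standalone sketch (not part of the module):
var toNumber = function (buf) {
  return buf.readUInt16BE(0) * 0xffffffff + buf.readUInt32BE(2)
}
var buf = Buffer.from([0xBB, 0xAA, 0xCC, 0xCC, 0xDD, 0xDD]);
console.log(toNumber(buf).toString(16)); // "bbaacccc2233"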
EDIT
As someone pointed out in the comments, I was totally wrong about 0xFFFFFFFF being a left shift of 32; it's just a number multiplication - I'm assuming it's some kind of internal convention for computing the value the module expects.
EDIT 2
After looking on the function in the original context, I've come to this conclusion:
This function is a part of a hashing flow, and it works in that manner:
The main flow receives a string input and a maximum number for the hash output; it then takes the string input and plugs it into the SHA-1 hashing function.
SHA-1 hashing returns a Buffer; the module takes that Buffer and applies the hash-indexing to it, as can be seen in the following code excerpt:
return toNumber(crypto.createHash('sha1').update(input).digest()) % max
Also, it uses a modulo to make sure the hash index returned doesn't exceed the maximum possible value.
Multiplication by 2 is equivalent to a shift of the bits to the left by 1, so multiplying by a power of two such as 2^32 is the equivalent of shifting the bits left 32 places; the 0xffffffff used here is one less than 2^32.
Here is a similar question already answered:
Bitwise Logic in C
