I have this string of bytes represented in hex:
const s = "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff\x8bV23J15O4\xb14\xb1H61417KKLL\xb50L5U\x8a\x05\x00\xf6\xaa\x8e.\x1c\x00\x00\x00"
I would like to convert it to a Uint8Array in order to manipulate it further.
How can this be done?
Update:
The binary string is coming from a Python backend. In Python I can create a base64 representation of it correctly:
encoded = base64.b64encode(b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff\x8bV23J15O4\xb14\xb1H61417KKLL\xb50L5U\x8a\x05\x00\xf6\xaa\x8e.\x1c\x00\x00\x00')
Since JavaScript strings support \x escapes, this should work to convert a Python byte string to a Uint8Array:
const s = "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff\x8bV23J15O4\xb14\xb1H61417KKLL\xb50L5U\x8a\x05\x00\xf6\xaa\x8e.\x1c\x00\x00\x00";
const array = Uint8Array.from([...s].map(v => v.charCodeAt(0)));
console.log(array);
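Since a string is itself iterable, the same conversion can also be written without the intermediate array by passing the mapping function straight to Uint8Array.from (a minimal sketch using a shortened sample string):

```javascript
// Shortened sample: the gzip magic bytes from the question's string.
const s = "\x1f\x8b\x08\x00";
// Uint8Array.from accepts an iterable plus a per-element mapping function.
const array = Uint8Array.from(s, c => c.charCodeAt(0));
```

Note this only round-trips while every code unit is 255 or less; a character above \xff would be silently truncated to its low byte.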
In Node.js, one uses Buffer.from to convert a (base64-encoded) string into a Buffer.
If the original argument is a base64 encoded string, as in Python:
const buffer = Buffer.from(encodedString, 'base64');
If it's a UTF-8 encoded string:
const buffer = Buffer.from(encodedString);
Buffers are instances of Uint8Array, so they can be used wherever a Uint8Array is expected. Quoting from the docs:
The Buffer class is a subclass of JavaScript's Uint8Array class and extends it with methods that cover additional use cases. Node.js APIs accept plain Uint8Arrays wherever Buffers are supported as well.
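As a quick sanity check of the base64 path (using a short made-up payload rather than the gzip string above):

```javascript
// 'aGVsbG8=' is the base64 encoding of the ASCII bytes of "hello".
const buffer = Buffer.from('aGVsbG8=', 'base64');
const first = buffer[0];               // 0x68, the byte for 'h'
const text = buffer.toString('utf8');  // "hello"
// Because Buffer subclasses Uint8Array, typed-array operations work too.
const copy = Uint8Array.from(buffer);
```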
const s = "\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff\x8bV23J15O4\xb14\xb1H61417KKLL\xb50L5U\x8a\x05\x00\xf6\xaa\x8e.\x1c\x00\x00\x00"
// btoa encodes a binary string to base64 (ASCII) - not the other way around
let str = btoa(s)
let encoder = new TextEncoder()
// encode() returns a Uint8Array - but it holds the UTF-8 bytes of the
// base64 text, not the original raw bytes
let typedarr = encoder.encode(str)
console.log(typedarr)
Suppose I have a base64 encoded string and I want to convert it into an ArrayBuffer, I can do it in this way:
// base64 decode the string to get the binary data
const binaryString = window.atob(base64EncodedString);
// convert from a binary string to an ArrayBuffer
const buf = new ArrayBuffer(binaryString.length);
const bufView = new Uint8Array(buf);
for (let i = 0, strLen = binaryString.length; i < strLen; i++) {
bufView[i] = binaryString.charCodeAt(i);
}
// get ArrayBuffer: `buf`
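The loop above can also be collapsed into one line with Uint8Array.from, since atob's result is iterable character by character (a sketch; atob is a global in browsers and in Node.js 16+, and the base64 input here is a made-up example):

```javascript
const base64EncodedString = 'AAECAw=='; // encodes the bytes 0, 1, 2, 3
// atob yields one character per decoded byte; map each to its code unit.
const bytes = Uint8Array.from(atob(base64EncodedString), c => c.charCodeAt(0));
const buf = bytes.buffer;               // the underlying ArrayBuffer
```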
Per String.prototype.charCodeAt(), it returns an integer between 0 and 65535 representing the UTF-16 code unit at the given index. But a Uint8Array's value range is [0, 255].
I was initially thinking that the code unit obtained from charCodeAt() could fall outside the Uint8Array range. Then I checked the built-in atob() function, which returns a binary string in which each character represents one byte of the decoded data, so its character codes are all in the range 0 to 255. That fits within the range of a Uint8Array, and that's why we are safe to use charCodeAt() in this case.
That's my understanding. I'm not sure if I interpret this correctly. Thanks for your help!
So looks like my understanding is correct.
Thanks to #Konrad, and here is his/her add-up:
charCodeAt is designed to support UTF-16. And UTF-16 was designed to be compatible with Latin-1 (and, in its first 128 code points, with ASCII), so the first 256 code units have exactly the same values as in Latin-1 encoding.
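A quick illustration of that boundary (with hypothetical sample characters):

```javascript
// '\xff' is the highest character atob can produce; its code unit is 255,
// which still fits in a Uint8Array slot.
const code = '\xff'.charCodeAt(0);   // 255
// A character beyond that range would not survive: storing its code unit
// in a Uint8Array keeps only the low 8 bits.
const high = '\u20ac'.charCodeAt(0);      // 8364
const truncated = Uint8Array.of(high)[0]; // 8364 mod 256 = 172
```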
const stringArray = ['0x00','0x3c','0xbc']
to
const array = [0x00,0x3c,0xbc]
const buf = Buffer.from(array)
How should I go about using the buffers in the string above as buffers?
You appear to have an array of strings where the strings are byte values written as hexadecimal strings. So you need to:
Convert each hex string to a byte; that's easily done with parseInt(str, 16) (the 16 being hexadecimal). parseInt will allow the 0x prefix. Or you could use +str or Number(str) since the prefix is there to tell them what number base to use. (More about various ways to convert strings to numbers in my answer here.)
Create a buffer and fill it in with the bytes.
If the array isn't massive and you can happily create a temporary array, use map and Buffer.from:
const buffer = Buffer.from(theArray.map(str => +str));
If you want to avoid any unnecessary intermediate arrays, I'm surprised not to see any variant of Buffer.from that allows mapping, so we have to do those things separately:
const buffer = Buffer.alloc(theArray.length);
for (let index = 0; index < theArray.length; ++index) {
    buffer[index] = +theArray[index];
}
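Applied to the (corrected) array from the question, either variant yields the same three bytes; a sketch of the map version:

```javascript
const stringArray = ['0x00', '0x3c', '0xbc'];
// The 0x prefix lets the unary + parse each string as hexadecimal.
const buffer = Buffer.from(stringArray.map(str => +str));
// buffer now holds the bytes 0x00, 0x3c, 0xbc.
```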
There are many Q&A's about converting blobs or Uint8Array to base64. But I have been unable to find how to convert from 32-bit arrays to base64. Here is an attempt.
function p(msg) { console.log(msg) }
let wav1 = [0.1,0.2,0.3]
let wav = new Float32Array(wav1)
p(`Len array to encrypt=${wav.length}`)
let omsg = JSON.stringify({onset: { id: 'abc', cntr: 1234}, wav: atob(wav) })
p(omsg)
The atob gives:
Uncaught InvalidCharacterError: Failed to execute 'atob' on 'Window':
The string to be decoded is not correctly encoded.
What intermediate step is needed to allow proper encoding of the floats to base64? Note that I have also tried TweetNaCl-util instead of atob this way:
nacl.util.encodeBase64(wav)
This results in the same error.
Update: Using JSON.stringify directly converts each float element into its ASCII decimal representation, which bloats the data size. For the above that is:
{"0":0.10000000149011612,"1":0.20000000298023224,"2":0.30000001192092896}
We are transferring large arrays, so this is a suboptimal solution.
Update: The crucial element of the solution in the accepted answer is using Float32Array(floats).buffer. I was unaware of the buffer attribute.
The problem with your current code is that nacl.util.encodeBase64() takes in either a string, Array, or Uint8Array. Since your input isn't an Array or Uint8Array, it assumes you want to pass it in as a string.
The solution, of course, is to encode it into a Uint8Array first, then encode the Uint8Array into base64. When you decode, first decode the base64 into a Uint8Array, then convert the Uint8Array back into your Float32Array. This can be done using JavaScript ArrayBuffer.
const floatSize = 4;

function floatArrayToBytes(floats) {
    var output = floats.buffer; // Get the ArrayBuffer from the float array.
    return new Uint8Array(output); // Convert the ArrayBuffer to Uint8s.
}

function bytesToFloatArray(bytes) {
    var output = bytes.buffer; // Get the ArrayBuffer from the Uint8Array.
    return new Float32Array(output); // Convert the ArrayBuffer to floats.
}
var encoded = nacl.util.encodeBase64(floatArrayToBytes(wav)) // Encode
var decoded = bytesToFloatArray(nacl.util.decodeBase64(encoded)) // Decode
If you don't like functions, here are some one-liners!
var encoded = nacl.util.encodeBase64(new Uint8Array(wav.buffer)) // Encode
var decoded = new Float32Array(nacl.util.decodeBase64(encoded).buffer) // Decode
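If you are in Node.js and don't need TweetNaCl-util at all, Buffer's built-in base64 support gives the same round trip; one caveat is that a decoded Buffer may sit at a non-zero offset inside Node's shared pool, so copy its byte range before viewing it as floats (a sketch under those assumptions):

```javascript
const wav = new Float32Array([0.1, 0.2, 0.3]);

// Encode: view the float data as bytes, then base64 them.
const encoded = Buffer.from(wav.buffer, wav.byteOffset, wav.byteLength)
    .toString('base64');

// Decode: copy the Buffer's byte range into its own ArrayBuffer so the
// Float32Array view is correctly aligned and sized.
const raw = Buffer.from(encoded, 'base64');
const decoded = new Float32Array(
    raw.buffer.slice(raw.byteOffset, raw.byteOffset + raw.byteLength)
);
```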
I'm trying to convert a string of 0s and 1s into the equivalent Buffer by parsing the character stream as UTF-16 code units.
For example:
var binary = "01010101010101000100010"
The result of that would be the following Buffer
<Buffer 55 54>
Please note Buffer.from(string, "binary") is not valid here, as it creates a buffer where each individual 0 or 1 is parsed as its own Latin-1 one-byte encoded character. From the Node.js documentation:
'latin1': A way of encoding the Buffer into a one-byte encoded string (as defined by the IANA in RFC 1345, page 63, to be the Latin-1 supplement block and C0/C1 control codes).
'binary': Alias for 'latin1'.
Use "".match to find all the groups of 16 bits.
Convert each binary string to a number using parseInt.
Write the numbers into a Buffer as big-endian 16-bit values. (A Uint16Array would use the platform's byte order - little-endian on most machines - which would swap the bytes within each pair.)
Tested on Node 10.x:
function binaryStringToBuffer(string) {
    const groups = string.match(/[01]{16}/g);
    const numbers = groups.map(binary => parseInt(binary, 2));
    const buffer = Buffer.alloc(numbers.length * 2);
    numbers.forEach((number, i) => buffer.writeUInt16BE(number, i * 2));
    return buffer;
}
console.log(binaryStringToBuffer("01010101010101000100010")) // <Buffer 55 54>
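One caveat with the /[01]{16}/g match: a trailing group shorter than 16 bits is silently dropped (the example input is 23 bits long, so its last 7 bits are ignored). If those bits matter, you could zero-pad the string up to the next multiple of 16 first (a sketch):

```javascript
const binary = "01010101010101000100010"; // 23 bits
// Right-pad with zeros so the length is a multiple of 16.
const padded = binary.padEnd(Math.ceil(binary.length / 16) * 16, "0");
// padded is now 32 bits and splits into two full 16-bit groups.
```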
Like the title states, I am just trying to encode some bytes in a string, then decode them back to bytes. The conversion of a Uint8 array of bytes to string then back to array does not happen perfectly. I am just wondering what encoding I should use in the conversion to make it happen correctly.
I try this as a dummy example:
var bytes = serializeToBinary(); // giving me bytes
console.log('bytes type:'+ Object.prototype.toString.call(bytes));
console.log('bytes length:'+ bytes.length);
var bytesStr = bytes.toString('base64'); // gives me a string that looks like '45,80,114,98,97,68,111'
console.log('bytesStr length:'+ bytesStr.length);
console.log('bytesStr type:'+ Object.prototype.toString.call(bytesStr));
var decodedbytesStr = Buffer.from(bytesStr, 'base64');
console.log('decodedbytesStr type:'+ Object.prototype.toString.call(decodedbytesStr));
console.log('decodedbytesStr length:'+ decodedbytesStr.length);
Output:
bytes type:[object Uint8Array]
bytes length:4235
bytesStr type:[object String]
bytesStr length:14161
decodedbytesStr type:[object Uint8Array]
decodedbytesStr length:7445
Shouldn't decodedbytesStr length and bytes length be the same?
TypedArray does not support .toString('base64'). The base64 argument is ignored, and you simply get a string representation of the array's values, separated by commas. This is not a base64 string, so Buffer.from(bytesStr, 'base64') is not processing it correctly.
You want to call .toString('base64') on a Buffer instead. When creating bytesStr, simply build a Buffer from your Uint8Array first:
var bytesStr = Buffer.from(bytes).toString('base64');
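With that change the round trip is lossless; a small sketch with made-up bytes:

```javascript
const bytes = Uint8Array.from([45, 80, 114, 98]);
// Wrap the Uint8Array in a Buffer to get real base64 encoding.
const bytesStr = Buffer.from(bytes).toString('base64');
// Decoding restores exactly the original byte sequence and length.
const decoded = Buffer.from(bytesStr, 'base64');
```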