Unpack BinaryString sent from JavaScript FileReader API to Python - javascript

I'm trying to unpack a binary string sent via JavaScript's FileReader readAsBinaryString method in my Python app. It seems I could use the struct module for this, but I'm unsure exactly what to provide as the format for the unpack.
Can someone confirm this is the right approach, and if so, what format I should specify?
According to the JS documentation:
The result will contain the file's data as a binary string. Every byte is represented by an integer in the range [0..255].

It sounds as if you just have an ordinary string (or bytes object in Python 3), so I'm not sure what you need to unpack.
One method of accessing the byte data is to use a bytearray; this lets you index the byte data easily:
>>> your_data = b'\x00\x12abc'
>>> b = bytearray(your_data)
>>> b[0]
0
>>> b[1]
18
If you have it as a string and don't want to use a bytearray (which needs Python 2.6 or later) then use ord to convert each character to an integer.
>>> ord(your_data[1])
18
If your binary data has a particular interpretation in terms of groups of bytes representing integers or floats with particular endianness then the struct module is certainly your friend, but you don't need it just to examine the byte data.

Related

output INT64 from js UDF

I'm trying to use BigQuery's INT64 type to hold bit-encoded information. I have to use a JavaScript UDF and I'd like to use all 64 bits.
My issue is that JavaScript's bitwise operators only work on 32-bit integers (so 1 << 32 == 1), and I'm not sure how to use the full 64-bit range that BigQuery supports in the UDF.
It's not possible to pass BigQuery's INT64 type directly to or from a JavaScript UDF, neither as input nor output, as JavaScript does not support a 64-bit integer type [1]. You could use FLOAT64 instead, as long as the values are less than 2^53 - 1, since it follows the IEEE 754-2008 standard for double precision [2]. You can also use a string containing the number value. Here is the documentation for supported external UDF data types [3].
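As a plain-JavaScript sketch of why the 32-bit limit bites and how far the FLOAT64 route can go (the hiLoToNumber helper below is made up for illustration; it is not a BigQuery API):
// Bitwise operators truncate their operands to 32 bits:
console.log(1 << 32); // 1, not 4294967296
// Keeping a value as two unsigned 32-bit halves and combining them with
// ordinary arithmetic stays exact as long as the result is below 2^53:
function hiLoToNumber(hi, lo) {
  return hi * 4294967296 + lo; // hi * 2^32 + lo
}
console.log(hiLoToNumber(1, 0));        // 4294967296
console.log(hiLoToNumber(0x1fffff, 0)); // 9007194959773696, just under 2^53
For the full 64-bit range you would fall back to the string representation mentioned above.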

Convert to string nodejs

0xc4115 0x4cf8
I'm not sure what data type this is, so my question is:
What data type is this, and how can I convert it to something more manageable using Node.js?
You can convert hexadecimal to decimal like this:
let hex_num = "0xc4115";
console.log(Number(hex_num)); // 803093
In general you have the prefix 0x for hexadecimal, 0b for binary and 0 for octal. All of these represent Numbers, and JavaScript converts them to decimal automatically. In case you want to do the conversion yourself, you can use parseInt(number, base), as in the examples below.
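For example:
console.log(parseInt("c4115", 16));   // 803093
console.log(parseInt("0xc4115", 16)); // 803093 (a 0x prefix is tolerated when the radix is 16)
console.log(parseInt("101", 2));      // 5
console.log((803093).toString(16));   // "c4115" -- toString with a radix converts back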

How do you decide which typed array to use?

I am trying to create a view of an ArrayBuffer object in order to JSONify it.
var data = { data: new Uint8Array(arrayBuffer) }
var json = JSON.stringify(data)
It seems that the size of the ArrayBuffer does not matter, even with the smallest Uint8Array; I did not get any RangeError so far. :) If so, how do I decide which typed array to use?
You decide based on the data stored in the buffer, or, better said, based on your interpretation of that data.
Also, a Uint8Array is not an 8-bit array; it's an array of unsigned 8-bit integers. It can have any length. A Uint8Array created from the same ArrayBuffer as a Uint16Array is going to be twice as long, because every byte in the ArrayBuffer becomes one element of the Uint8Array, while for the Uint16Array each pair of bytes becomes one element.
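For instance, a two-line check of that length relationship:
var buf = new ArrayBuffer(4);
console.log(new Uint8Array(buf).length, new Uint16Array(buf).length); // 4 2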
A good way to see what happens is to think in binary. Try running this:
var buffer = new ArrayBuffer(2);
var uint8View = new Uint8Array(buffer);
var uint16View = new Uint16Array(buffer);
uint8View[0] = 2;
uint8View[1] = 1;
console.log(uint8View[0].toString(2));
console.log(uint8View[1].toString(2));
console.log(uint16View[0].toString(2));
The output is going to be
10
1
100000010
because displayed as unsigned 8-bit integers in binary, 2 is 00000010 and 1 is 00000001 (toString strips the leading zeroes).
Uint8Array represents an array of bytes; as said above, each element is an unsigned 8-bit integer. We just wrote two bytes to it.
In memory those two bytes are stored side by side as 00000010 00000001 (byte 0 first; binary form again used to make things clearer).
Now when you initialize a Uint16Array over the same buffer, it contains the same bytes, but because each element is an unsigned 16-bit integer (two bytes), accessing uint16View[0] reads the first two bytes together. On a little-endian machine (the common case) the low-order byte comes first, so the value read is 0000000100000010, which prints as 100000010 with the leading zeroes stripped.
Interpreted as a decimal (base 10) integer, that binary value is 258.
Neither Uint8Array nor Uint16Array store any data themselves. They are simply different ways of accessing bytes in an ArrayBuffer.
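You can see that sharing directly by writing through one view and reading through the other (output shown for a little-endian machine, the common case):
var buf = new ArrayBuffer(2);
var u8 = new Uint8Array(buf);
var u16 = new Uint16Array(buf);
u16[0] = 258;              // 0x0102 as a 16-bit integer
console.log(u8[0], u8[1]); // 2 1 -- the low-order byte sits first in memory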
How does one choose which one to use? It's not based on preference but on the underlying data. An ArrayBuffer is what you use when you receive binary data from some external source (a WebSocket, maybe) and already know what the data represents. It might be a list of unsigned 8-bit integers, or one of signed 16-bit ones, or even a mixed list where you know the first element is an 8-bit integer and the next one is a 16-bit one. Then you can use a DataView to read typed items from it, as in the sketch below.
If you don't know what the data represents you can't choose what to use.
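To make the mixed-list case concrete, here is a minimal DataView sketch. The two-field layout is made up purely for illustration: byte 0 holds an unsigned 8-bit integer and bytes 1-2 hold a little-endian unsigned 16-bit integer.
var buf = new ArrayBuffer(3);
var view = new DataView(buf);
view.setUint8(0, 7);            // byte 0: unsigned 8-bit integer
view.setUint16(1, 258, true);   // bytes 1-2: unsigned 16-bit, little-endian
console.log(view.getUint8(0));        // 7
console.log(view.getUint16(1, true)); // 258
Unlike the typed-array views, DataView takes an explicit endianness flag on every read and write, so the same code behaves identically on any machine.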

How to put and get group of unknown bytes as a string to JSON?

I have unknown bytes whose bit patterns range from 00000000 to 11111111.
I use every 8 bits as a character. How can I add these characters to JSON without error, and get them back?
I've searched a lot but cannot find an answer. Using Java and JavaScript.
You have several options.
Assuming you're starting with an array of numbers, the simplest method is to just convert to JSON directly. JSON does support arrays of numbers after all.
bytes = Array.apply(null, {length:256}).map(function(a, b) {return b;}) // builds [0, 1, ..., 255]
JSON.stringify(bytes)
"[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255]"
JSON.parse(JSON.stringify(bytes))
Array[256]
If you really want to convert it to a string for some reason, JavaScript strings are Unicode, so they can hold code points 0-255 losslessly.
s = bytes.map(function(x) {return String.fromCharCode(x);}).join('')
"
bytes2 = s.split('').map(function(x) {return x.charCodeAt(0);})
Array[256]
JSON also supports strings, so you can convert the string to and from JSON if you want to, though I can't imagine why you would.
s
" !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ" (the 32 leading control characters are unprintable and omitted from the echo)
JSON.stringify(s)
""\u0000\u0001\u0002\u0003\u0004\u0005\u0006\u0007\b\t\n\u000b\f\r\u000e\u000f\u0010\u0011\u0012\u0013\u0014\u0015\u0016\u0017\u0018\u0019\u001a\u001b\u001c\u001d\u001e\u001f !\"#$%&'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ""
JSON.parse(JSON.stringify(s))
"

JavaScript-friendly binary-safe data format design (not JSON or XML)

First and foremost: JSON and XML are not an option in this specific case, please don't suggest them. If this makes it easier to accept that fact, imagine that I intend to reinvent the wheel for self-education.
Back to the point:
I need to design a binary-safe data format to encode some datagrams I send to a particular dumb server that I write (in C if that matters).
To simplify the question, let's say that I'm sending only numbers, strings and arrays.
Important fact: Server does not (and should not) know anything about Unicode and stuff. It treats all strings as binary blobs (and never looks inside them).
The format that I originally devised is as follows:
Datagram: <Number:size>\n<Value1>...<ValueN>
Value:
Number: N\n<Value>\n
String: S\n<Number:size-in-bytes>\n<bytes>\n
Array: A\n<Number:size>\n<Value0>...<ValueN>
Example:
[ 1, "foo", [] ]
Serializes as follows:
1 ; number of items in datagram
A ; -- array --
3 ; number of items in array
N ; -- number --
1 ; number value
S ; -- string --
3 ; string size in bytes
foo ; string bytes
A ; -- array --
0 ; number of items in array
The problem is that I cannot reliably get a string's size in bytes in JavaScript.
So, the question is: how should I change the format so that a string can be both saved in JS and loaded in C neatly?
I do not want to add Unicode support to the server.
And I do not quite want to decode strings on server (say, from base64 or simply to unescape \xNN sequences) — this would require work with dynamic string buffers, which, given how dumb the server is, is not so desirable...
Any clues?
Update: it seems that reading UTF-8 in plain C is not that scary after all, so I'm extending the protocol to handle UTF-8 strings natively. (But I will still appreciate an answer to this question as it stands.)
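For what it's worth, later JavaScript (postdating this question) offers a direct fix for the byte-size problem: TextEncoder reports a string's UTF-8 byte length. Below is a minimal serializer sketch for the format above, assuming the datagram is ultimately sent as UTF-8 bytes; the function names are made up for illustration.
function serializeValue(v) {
  if (typeof v === "number") return "N\n" + v + "\n";
  if (typeof v === "string") {
    // .length counts UTF-16 code units; TextEncoder counts UTF-8 bytes.
    var size = new TextEncoder().encode(v).length;
    return "S\n" + size + "\n" + v + "\n";
  }
  if (Array.isArray(v)) {
    return "A\n" + v.length + "\n" + v.map(serializeValue).join("");
  }
  throw new Error("unsupported type");
}

function serializeDatagram(values) {
  return values.length + "\n" + values.map(serializeValue).join("");
}

// Reproduces the worked example for [ 1, "foo", [] ]:
console.log(serializeDatagram([[1, "foo", []]]));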
