I'm trying to get EXIF data from a JPG using JavaScript in the browser.
I'm using the FileReader class and its readAsArrayBuffer() method.
For most operations I need a Uint8Array, so that's what I'm casting the ArrayBuffer to.
I've added a function to Uint8Array objects for when I need a short:
const getShort = function(position, bigEndian = true) {
  const int1 = this[position];
  const int2 = this[position+1];
  let result = (int1 << 8) | (int2 & 0xFF);
  if(!bigEndian) {
    let buffer = new ArrayBuffer(16);
    let view = new DataView(buffer);
    view.setInt16(1, result);
    result = view.getInt16(1, true);
  }
  return result;
}
The problem is that when parsing the bytes 0110 1001 and 1000 0111, I get 1000 0111 0110 1001, which is interpreted as -30871 instead of 34665.
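For illustration, here is a minimal sketch (assuming the unsigned little-endian value is what's wanted) showing the signed/unsigned difference when those same two bytes are read through a DataView:
const bytes = Uint8Array.of(0b01101001, 0b10000111); // 0x69, 0x87
const view = new DataView(bytes.buffer);
console.log(view.getInt16(0, true));  // -30871 (signed, little-endian)
console.log(view.getUint16(0, true)); // 34665 (unsigned, little-endian)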
I have found 3 methods to convert a Uint8Array to a BigInt, and all of them give different results for some reason. Could you please tell me which one is correct and which one I should use?
Using the bigint-conversion library: we can use the bigintConversion.bufToBigint() function to get a BigInt. The implementation is as follows:
export function bufToBigint (buf: ArrayBuffer|TypedArray|Buffer): bigint {
  let bits = 8n
  if (ArrayBuffer.isView(buf)) bits = BigInt(buf.BYTES_PER_ELEMENT * 8)
  else buf = new Uint8Array(buf)
  let ret = 0n
  for (const i of (buf as TypedArray|Buffer).values()) {
    const bi = BigInt(i)
    ret = (ret << bits) + bi
  }
  return ret
}
Using DataView:
let view = new DataView(arr.buffer, 0);
let result = view.getBigUint64(0, true);
Using a FOR loop:
let result = BigInt(0);
for (let i = arr.length - 1; i >= 0; i++) {
  result = result * BigInt(256) + BigInt(arr[i]);
}
I'm honestly confused about which one is right, since all of them give different results, but they do all give results.
I'm fine with either BE or LE, but I'd just like to know why these 3 methods give different results.
One reason for the different results is that they use different endianness.
Let's turn your snippets into a form where we can execute and compare them:
let source_array = new Uint8Array([
  0xff, 0xee, 0xdd, 0xcc, 0xbb, 0xaa, 0x99, 0x88,
  0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11]);
let buffer = source_array.buffer;

function method1(buf) {
  let bits = 8n
  if (ArrayBuffer.isView(buf)) {
    bits = BigInt(buf.BYTES_PER_ELEMENT * 8)
  } else {
    buf = new Uint8Array(buf)
  }
  let ret = 0n
  for (const i of buf.values()) {
    const bi = BigInt(i)
    ret = (ret << bits) + bi
  }
  return ret
}

function method2(buf) {
  let view = new DataView(buf, 0);
  return view.getBigUint64(0, true);
}

function method3(buf) {
  let arr = new Uint8Array(buf);
  let result = BigInt(0);
  for (let i = arr.length - 1; i >= 0; i--) {
    result = result * BigInt(256) + BigInt(arr[i]);
  }
  return result;
}

console.log(method1(buffer).toString(16));
console.log(method2(buffer).toString(16));
console.log(method3(buffer).toString(16));
Note that this includes a bug fix for method3: where you wrote for (let i = arr.length - 1; i >= 0; i++), you clearly meant i-- at the end.
For "method1" this prints: ffeeddccbbaa998877665544332211
Because method1 is a big-endian conversion (first byte of the array is most-significant part of the result) without size limit.
For "method2" this prints: 8899aabbccddeeff
Because method2 is a little-endian conversion (first byte of the array is least significant part of the result) limited to 64 bits.
If you switch the second getBigUint64 argument from true to false, you get big-endian behavior: ffeeddccbbaa9988.
To eliminate the size limitation, you'd have to add a loop: using getBigUint64 you can get 64-bit chunks, which you can assemble using shifts similar to method1 and method3.
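A minimal sketch of such a loop (the name is mine; it assumes the buffer length is a multiple of 8, so pad the input otherwise):
function method2Unlimited(buf) {
  let view = new DataView(buf);
  let result = 0n;
  // read big-endian 64-bit chunks and shift them together
  for (let offset = 0; offset < view.byteLength; offset += 8) {
    result = (result << 64n) | view.getBigUint64(offset, false);
  }
  return result;
}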
For "method3" this prints: 112233445566778899aabbccddeeff
Because method3 is a little-endian conversion without size limit. If you reverse the for-loop's direction, you'll get the same big-endian behavior as method1: result * 256n gives the same value as result << 8n; the latter is a bit faster.
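For instance, a reversed, big-endian variant of method3 could look like this (hypothetical name):
function method3BigEndian(buf) {
  let arr = new Uint8Array(buf);
  let result = 0n;
  for (let i = 0; i < arr.length; i++) {
    result = (result << 8n) | BigInt(arr[i]);
  }
  return result;
}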
(Side note: BigInt(0) and BigInt(256) are needlessly verbose, just write 0n and 256n instead. Additional benefit: 123456789123456789n does what you'd expect, BigInt(123456789123456789) does not.)
So which method should you use? That depends on:
(1) Do your incoming arrays assume BE or LE encoding?
(2) Are your BigInts limited to 64 bits or arbitrarily large?
(3) Is this performance-critical code, or are all approaches "fast enough"?
Taking a step back: if you control both parts of the overall process (converting BigInts to Uint8Array, then transmitting/storing them, then converting back to BigInt), consider simply using hexadecimal strings instead: that'll be easier to code, easier to debug, and significantly faster. Something like:
function serialize(bigint) {
  return "0x" + bigint.toString(16);
}
function deserialize(serialized_bigint) {
  return BigInt(serialized_bigint);
}
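For example, serialize(0xffeen) yields "0xffee", and deserialize("0xffee") gives back 0xffeen.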
If you need to store really big integers that aren't bound to 64 or 128 bits, and you also need to keep negative numbers, then this is a solution for you. (The shift-and-XOR step below is ZigZag encoding, the same trick protocol buffers use for signed varints.)
function encode(n) {
  let hex, bytes
  // shift the number 1 step to the left and XOR if less than 0
  n = (n << 1n) ^ (n < 0n ? -1n : 0n)
  // convert to hex
  hex = n.toString(16)
  // pad if necessary
  if (hex.length % 2) hex = '0' + hex
  // convert hex to bytes
  bytes = hex.match(/.{1,2}/g).map(byte => parseInt(byte, 16))
  return bytes
}

function decode(bytes) {
  let hex, n
  // convert bytes back into hex
  hex = bytes.map(e => e.toString(16).padStart(2, '0')).join('')
  // convert hex to BigInt
  n = BigInt(`0x` + hex)
  // shift the number back to the right and XOR if the lowest bit was set
  n = (n >> 1n) ^ (n & 1n ? -1n : 0n)
  return n
}
const input = document.querySelector('input')
input.oninput = () => {
  console.clear()
  const bytes = encode(BigInt(input.value))
  // TODO: save or transmit these bytes
  // new Uint8Array(bytes)
  console.log(bytes.join(','))
  const n = decode(bytes)
  console.log(n.toString(10) + 'n') // cuz SO can't render bigints...
}
input.oninput()
<input type="number" value="-39287498324798237498237498273323423" style="width: 100%">
I could verify an ECDSA / SHA256 signature using the standard JavaScript library (window.crypto.subtle.verify), but cannot do so using the jsrsasign library (KJUR.crypto). I have also tried the 'KJUR.crypto.ECDSA' class directly, but no luck either.
See below the two script methods, which don't produce the same result. Could someone advise on the issue(s)?
// function to convert a hex string to bytes - returns a Uint8Array
function hexStringToUint8Array(hexString) {
  if (hexString.length % 2 != 0)
    throw "Invalid hexString";
  var arrayBuffer = new Uint8Array(hexString.length / 2);
  for (var i = 0; i < hexString.length; i += 2) {
    var byteValue = parseInt(hexString.substr(i, 2), 16);
    if (Number.isNaN(byteValue)) // note: `byteValue == NaN` is always false
      throw "Invalid hexString";
    arrayBuffer[i / 2] = byteValue;
  }
  return arrayBuffer;
}
// function to convert Base64 to hex (8-bit format)
function base64ToHex(str) {
  const raw = atob(str);
  let result = '';
  for (let i = 0; i < raw.length; i++) {
    const hex = raw.charCodeAt(i).toString(16);
    result += (hex.length === 2 ? hex : '0' + hex);
  }
  return result;
}

// convert Base64URL to Base64
function base64urlToBase64(base64url) {
  base64url = base64url.toString();
  return base64url
    .replace(/\-/g, "+")
    .replace(/_/g, "/");
}
//Define values
Base64URL_coordX = '2uYQAsY-bvzz7r7SL-tK2C0eySfYEf1blv91cnd_1G4';
Base64URL_coordY = 'S3j1vy2sbkExAYXumb3w1HMVH-4ztoHclVTwQd45Reg';
signature = 'ed0c2b2e56731511ce2cea1d7320cdbc39dbabca7f525ec5d646b7c11cb35d5846a1cb70c2a1d8480f5ef88b46d401ca78b18ccae9ae4e3934a6b8fe412f7b11';
dataHex = '48656c6c6f20386777696669'; // ='Hello 8gwifi'
//////////// Verifying method using standard JavaScript
var dataToVerify = hexStringToUint8Array(dataHex);
var SignatureToVerify = hexStringToUint8Array(signature);

window.crypto.subtle.importKey(
  "jwk", // can be "jwk" (public or private), "spki" (public only), or "pkcs8" (private only)
  { // this is an example jwk key, other key types are Uint8Array objects
    kty: "EC",
    crv: "P-256",
    x: Base64URL_coordX, // expects x and y to be «base64url» encoded
    y: Base64URL_coordY,
    ext: true,
  },
  { // these are the algorithm options
    name: "ECDSA",
    namedCurve: "P-256", // can be "P-256", "P-384", or "P-521"
  },
  false, // whether the key is extractable (i.e. can be used in exportKey)
  ["verify"] // "verify" for public key import, "sign" for private key imports
)
.then(function(publicKey) {
  window.crypto.subtle.verify(
    {
      name: "ECDSA",
      hash: {name: "SHA-256"}, // can be "SHA-1", "SHA-256", "SHA-384", or "SHA-512"
    },
    publicKey, // from generateKey or importKey above
    SignatureToVerify, // ArrayBuffer of the signature
    dataToVerify // ArrayBuffer of the data
  )
  .then(function(isvalid) {
    console.log('Signature valid1: ', isvalid);
  })
  .catch(function(err) {
    console.error(err);
  });
});
////////////Verifying Method using KJUR
Hex_coordX = base64ToHex(base64urlToBase64(Base64URL_coordX));
Hex_coordY = base64ToHex(base64urlToBase64(Base64URL_coordY));
var XY = Hex_coordX.toString(16) + Hex_coordY.toString(16);
var sig = new KJUR.crypto.Signature({"alg": "SHA256withECDSA", "prov": "cryptojs/jsrsa"});
sig.init({xy: XY, curve: "secp256r1"});
sig.updateHex(dataHex);
var result = sig.verify(signature);
//Printing Verification
console.log('Signature valid2: ', result);
It says in the description of the library that it is JCA style. This probably means that the signature generation / verification functions have an ASN.1 / DER encoded input / output.
This consists of an ASN.1 SEQUENCE (tag 0x30) with the length of the two integers inside. The two INTEGERs have tag 0x02 and a length equal to the size of the integer values of the r and s components of the signature. These are big-endian, signed integers (which means stripping leading bytes if they are 0x00, or prepending a 0x00 if the top byte is 0x80 or higher).
In your case that would be:
r = ed0c2b2e56731511ce2cea1d7320cdbc39dbabca7f525ec5d646b7c11cb35d58
s = 46a1cb70c2a1d8480f5ef88b46d401ca78b18ccae9ae4e3934a6b8fe412f7b11
Now converting these to DER ASN.1:
ri = 02 21 00 ed0c2b2e56731511ce2cea1d7320cdbc39dbabca7f525ec5d646b7c11cb35d58
si = 02 20 46a1cb70c2a1d8480f5ef88b46d401ca78b18ccae9ae4e3934a6b8fe412f7b11
and finally adding the sequence and adding the concatenation of above:
sig = 30 45 02 21 00 ed0c2b2e56731511ce2cea1d7320cdbc39dbabca7f525ec5d646b7c11cb35d58
02 20 46a1cb70c2a1d8480f5ef88b46d401ca78b18ccae9ae4e3934a6b8fe412f7b11
and checking the result e.g. here.
But I guess in your case just calling the function concatSigToASN1Sig would be faster :P
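For reference, here is a minimal sketch of the raw-to-DER conversion described above (the function name is mine; it assumes the encoded SEQUENCE stays under 128 bytes, which holds for P-256 signatures):
function rawSigToDerHex(rawHex) {
  const toDerInt = (hex) => {
    let bytes = hex.match(/.{2}/g).map(b => parseInt(b, 16));
    // strip 0x00 bytes that aren't needed to keep the integer positive
    while (bytes.length > 1 && bytes[0] === 0x00 && bytes[1] < 0x80) bytes.shift();
    // prepend 0x00 if the top byte is 0x80 or higher
    if (bytes[0] >= 0x80) bytes.unshift(0x00);
    return [0x02, bytes.length, ...bytes];
  };
  const half = rawHex.length / 2;
  const body = [...toDerInt(rawHex.slice(0, half)), ...toDerInt(rawHex.slice(half))];
  return [0x30, body.length, ...body].map(b => b.toString(16).padStart(2, '0')).join('');
}
Feeding it the signature above yields the 3045022100ed0c... sequence shown earlier.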
I need to add compression to my project, and I decided to use the LZJB algorithm, which is fast and has a small implementation. I found this library: https://github.com/copy/jslzjb-k
But the API is not very nice, because to decompress the file you need the input buffer length (since Uint8Array is not dynamic, you need to allocate some data up front). So I want to save the length of the input buffer as the first few bytes of the Uint8Array, so I can extract that value and create the output Uint8Array based on that integer value.
I want the function that returns a Uint8Array from an integer to be generic; maybe save the length of the bytes into the first byte, so you know how much data you need to extract to read the integer. I guess I need to extract those bytes and use some bit shifting to get the original number, but I'm not exactly sure how to do this.
So how can I write a generic function that converts an integer into a Uint8Array that can be embedded into a bigger array, and then extract that number back?
Here are working functions (based on Converting javascript Integer to byte array and back)
function numberToBytes(number) {
  // you can use a constant number of bytes by using 8 or 4 instead;
  // number + 1 handles exact powers of 256, and Math.max ensures
  // at least one byte, so numberToBytes(0) works too
  const len = Math.max(1, Math.ceil(Math.log2(number + 1) / 8));
  const byteArray = new Uint8Array(len);
  for (let index = 0; index < byteArray.length; index++) {
    const byte = number & 0xff;
    byteArray[index] = byte;
    number = (number - byte) / 256;
  }
  return byteArray;
}
function bytesToNumber(byteArray) {
  let result = 0;
  for (let i = byteArray.length - 1; i >= 0; i--) {
    result = (result * 256) + byteArray[i];
  }
  return result;
}
By using const len = Math.max(1, Math.ceil(Math.log2(number + 1) / 8)); the array has only the bytes it needs. If you want a fixed size, you can use a constant 8 or 4 instead.
In my case, I just saved the length of the bytes in the first byte.
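A minimal sketch of that length-prefix idea, building on the two functions above (the helper names are mine):
function numberToPrefixedBytes(number) {
  const bytes = numberToBytes(number);
  // the first byte stores how many bytes the number itself occupies
  return Uint8Array.of(bytes.length, ...bytes);
}
function prefixedBytesToNumber(array, start = 0) {
  const len = array[start];
  return bytesToNumber(array.subarray(start + 1, start + 1 + len));
}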
General answer
These functions allow any integer (they use BigInts internally, but accept Number arguments too) to be encoded into, and decoded from, any part of a Uint8Array. It is somewhat overkill, but I wanted to learn how to work with arbitrary-sized integers in JS.
// n can be a bigint or a number
// bs is an optional Uint8Array of sufficient size
//   if unspecified, a large-enough Uint8Array will be allocated
// start (optional) is the offset
//   where the length-prefixed number will be written
// returns the resulting Uint8Array
function writePrefixedNum(n, bs, start) {
  start = start || 0;
  let len = start+2; // start, length, and 1 byte min
  for (let i=0x100n; i<=n; i<<=8n, len++) /* increment length */;
  if (bs === undefined) {
    bs = new Uint8Array(len);
  } else if (bs.length < len) {
    throw `byte array too small; ${bs.length} < ${len}`;
  }
  let r = BigInt(n);
  for (let pos = start+1; pos < len; pos++) {
    bs[pos] = Number(r & 0xffn);
    r >>= 8n;
  }
  bs[start] = len-start-1; // write byte-count to start byte
  return bs;
}

// bs must be a Uint8Array from where the number will be read
// start (optional, defaults to 0)
//   is where the length-prefixed number can be found
// returns a bigint, which can be coerced to int using Number()
function readPrefixedNum(bs, start) {
  start = start || 0;
  let size = bs[start]; // read byte-count from start byte
  let n = 0n;
  if (bs.length < start+1+size) {
    throw `byte array too small; ${bs.length} < ${start+1+size}`;
  }
  for (let pos = start+size; pos >= start+1; pos--) {
    n <<= 8n;
    n |= BigInt(bs[pos])
  }
  return n;
}
function test(n) {
  const offset = 2;
  let bs = writePrefixedNum(n, undefined, offset);
  console.log(bs);
  let result = readPrefixedNum(bs, offset);
  console.log(n, result, "correct?", n == result)
}
test(0)
test(0x1020304050607080n)
test(0x0807060504030201n)
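For good measure, a byte-boundary value can be checked as well:
test(0x100n)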
Simple 4-byte answer
This answer encodes 4-byte integers to and from Uint8Arrays.
function intToArray(i) {
  return Uint8Array.of(
    (i & 0xff000000) >> 24,
    (i & 0x00ff0000) >> 16,
    (i & 0x0000ff00) >> 8,
    (i & 0x000000ff) >> 0);
}

function arrayToInt(bs, start) {
  start = start || 0;
  const bytes = bs.subarray(start, start + 4);
  let n = 0;
  for (const byte of bytes.values()) {
    n = (n << 8) | byte;
  }
  return n;
}

for (let v of [123, 123 << 8, 123 << 16, 123 << 24]) {
  let a = intToArray(v);
  let r = arrayToInt(a, 0);
  console.log(v, a, r);
}
Posting this one-liner in case it is useful to anyone. It strictly uses bitwise operations and has no need for constants or values other than the input to be defined. Note that JavaScript's bitwise operators work on 32-bit integers, so this is only safe for numbers below 2^31.
export const encodeUvarint = (n: number): Uint8Array => n >= 0x80
  ? Uint8Array.from([(n & 0x7f) | 0x80, ...encodeUvarint(n >> 7)])
  : Uint8Array.from([n & 0xff]);
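A possible decoder counterpart, for illustration (the name decodeUvarint is mine, not from the original post):
const decodeUvarint = (bytes) => {
  let n = 0, shift = 0;
  for (const b of bytes) {
    n += (b & 0x7f) * 2 ** shift; // multiply rather than shift to avoid 32-bit truncation
    if ((b & 0x80) === 0) break;  // a clear high bit marks the final byte
    shift += 7;
  }
  return n;
};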
It seems like there is nothing to handle endianness when working with Uint8. For example, when dealing with Uint16, you can choose whether you want little or big endian:
dataview.setUint16(byteOffset, value [, littleEndian])
vs
dataview.setUint8(byteOffset, value)
I guess this is because endianness deals with the order of the bytes, and if I'm inserting one byte at a time, then I need to order them myself.
So how do I go about handling endianness myself? I'm creating a WAVE file header using this spec: http://soundfile.sapp.org/doc/WaveFormat/
The first part of the header is "ChunkID" in big endian and this is how I do it:
dataView.setUint8(0, 'R'.charCodeAt());
dataView.setUint8(1, 'I'.charCodeAt());
dataView.setUint8(2, 'F'.charCodeAt());
dataView.setUint8(3, 'F'.charCodeAt());
The second part of the header is "ChunkSize" in little endian, and this is how I do it:
dataView.setUint8(4, 172);
Now I suppose that, since the endianness of those chunks is different, I should be doing something different for each chunk. What should I do differently in those two instances?
Cheers!
EDIT
I'm asking this question because the WAV file I'm creating is invalid (according to https://indiehd.com/auxiliary/flac-validator/). I suspect this is because I'm not handling the endianness correctly. This is the full WAVE file:
const fs = require('fs');
const BITS_PER_BYTE = 8;
const BITS_PER_SAMPLE = 8;
const SAMPLE_RATE = 44100;
const NB_CHANNELS = 2;
const SUB_CHUNK_2_SIZE = 128;
const chunkSize = 36 + SUB_CHUNK_2_SIZE;
const blockAlign = NB_CHANNELS * (BITS_PER_SAMPLE / BITS_PER_BYTE);
const byteRate = SAMPLE_RATE * blockAlign;
const arrayBuffer = new ArrayBuffer(chunkSize + 8)
const dataView = new DataView(arrayBuffer);
// The RIFF chunk descriptor
// ChunkID
dataView.setUint8(0, 'R'.charCodeAt());
dataView.setUint8(1, 'I'.charCodeAt());
dataView.setUint8(2, 'F'.charCodeAt());
dataView.setUint8(3, 'F'.charCodeAt());
// ChunkSize
dataView.setUint8(4, chunkSize);
// Format
dataView.setUint8(8, 'W'.charCodeAt());
dataView.setUint8(9, 'A'.charCodeAt());
dataView.setUint8(10, 'V'.charCodeAt());
dataView.setUint8(11, 'E'.charCodeAt());
// The fmt sub-chunk
// Subchunk1ID
dataView.setUint8(12, 'f'.charCodeAt());
dataView.setUint8(13, 'm'.charCodeAt());
dataView.setUint8(14, 't'.charCodeAt());
// Subchunk1Size
dataView.setUint8(16, 16);
// AudioFormat
dataView.setUint8(20, 1);
// NumChannels
dataView.setUint8(22, NB_CHANNELS);
// SampleRate
dataView.setUint8(24, ((SAMPLE_RATE >> 8) & 255));
dataView.setUint8(25, SAMPLE_RATE & 255);
// ByteRate
dataView.setUint8(28, ((byteRate >> 8) & 255));
dataView.setUint8(29, byteRate & 255);
// BlockAlign
dataView.setUint8(32, blockAlign);
// BitsPerSample
dataView.setUint8(34, BITS_PER_SAMPLE);
// The data sub-chunk
// Subchunk2ID
dataView.setUint8(36, 'd'.charCodeAt());
dataView.setUint8(37, 'a'.charCodeAt());
dataView.setUint8(38, 't'.charCodeAt());
dataView.setUint8(39, 'a'.charCodeAt());
// Subchunk2Size
dataView.setUint8(40, SUB_CHUNK_2_SIZE);
// Data
for (let i = 0; i < SUB_CHUNK_2_SIZE; i++) {
  dataView.setUint8(i + 44, i);
}
A single byte (uint8) doesn't have any endianness; endianness is a property of a sequence of bytes.
According to the spec you linked, the ChunkSize takes up 4 bytes, with the least significant byte first (little endian). If your value is only one byte (not larger than 255), you would just write the byte at offset 4 as you did. If the 4 bytes were in big-endian order, you'd have to write your byte at offset 7.
I would, however, recommend simply using setUint32:
dataView.setUint32(0, 0x52494646, false); // "RIFF"
dataView.setUint32(4, 172, true);         // ChunkSize
dataView.setUint32(8, 0x57415645, false); // "WAVE"
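The same approach extends to the remaining multi-byte fields; here is a sketch using the constants from the question (offsets per the linked WaveFormat spec):
dataView.setUint32(12, 0x666d7420, false);      // "fmt "
dataView.setUint32(16, 16, true);               // Subchunk1Size
dataView.setUint16(20, 1, true);                // AudioFormat (1 = PCM)
dataView.setUint16(22, NB_CHANNELS, true);      // NumChannels
dataView.setUint32(24, SAMPLE_RATE, true);      // SampleRate
dataView.setUint32(28, byteRate, true);         // ByteRate
dataView.setUint16(32, blockAlign, true);       // BlockAlign
dataView.setUint16(34, BITS_PER_SAMPLE, true);  // BitsPerSample
dataView.setUint32(36, 0x64617461, false);      // "data"
dataView.setUint32(40, SUB_CHUNK_2_SIZE, true); // Subchunk2Size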
I'm learning about blockchain and wanted to create an example of creating an address, purely for educational purposes - WOULD NOT BE DONE ANYWHERE NEAR PRODUCTION.
Task: create 160 random bits, convert them to hex, convert that to base 58, then test correctness by reversing the process.
It kind of works; however, I get intermittent 'false' results when comparing the before and after binary. The hexStringToBinary function returns strings with varying lengths:
const bs58 = require('bs58');

// 20 * 8 = 160
function generate20Bytes() {
  let byteArray = [];
  let bytes = 0;
  while (bytes < 20) {
    let byte = '';
    while (byte.length < 8) {
      byte += Math.floor(Math.random() * 2);
    }
    byteArray.push(byte);
    bytes++;
  }
  return byteArray;
}

// the issue is probably from here
function hexStringToBinary(string) {
  return string.match(/.{1,2}/g)
    .map(hex => parseInt(hex, 16).toString(2).padStart(8, '0'));
}
const addressArray = generate20Bytes();
const binaryAddress = addressArray.join('');
const hex = addressArray.map(byte => parseInt(byte, 2).toString(16)).join('');
console.log(hex);
// then lets convert it to base 58
const base58 = bs58.encode(Buffer.from(hex));
console.log('base 58');
console.log(base58);
// lets see if we can reverse the process
const destructuredHex = bs58.decode(base58).toString();
console.log('hex is the same');
console.log(hex === destructuredHex);
// lets convert back to a binary string
const destructuredAddress = hexStringToBinary(destructuredHex).join('');
console.log('destructured address');
console.log(destructuredAddress);
console.log('binaryAddress address');
console.log(binaryAddress);
//intermittent false/true
console.log(destructuredAddress === binaryAddress);
Got round to refactoring with TDD. Realised it wasn't zero-filling hex values < 16. My playground repo
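In essence, the fix is to zero-pad each hex byte to two digits when building the hex string:
const hex = addressArray.map(byte => parseInt(byte, 2).toString(16).padStart(2, '0')).join('');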