Using DataView with nodejs Buffer - javascript

I'm trying to read/write some binary data using the DataView object. It seems to work correctly when the buffer is initialized from a Uint8Array, but if I pass it a Node.js Buffer object the results seem to be off. Am I using the API incorrectly?
import { expect } from 'chai';

describe('read binary data', () => {
  it('test buffer', () => {
    let arr = Buffer.from([0x55, 0xaa, 0x55, 0xaa, 0x60, 0x00, 0x00, 0x00, 0xd4, 0x03, 0x00, 0x00, 0x1c, 0xd0, 0xbb, 0xd3, 0x00, 0x00, 0x00, 0x00]);
    let read = new DataView(arr.buffer).getUint32(0, false);
    expect(read).to.eq(0x55aa55aa);
  });
  it('test uint8array', () => {
    let arr = new Uint8Array([0x55, 0xaa, 0x55, 0xaa, 0x60, 0x00, 0x00, 0x00, 0xd4, 0x03, 0x00, 0x00, 0x1c, 0xd0, 0xbb, 0xd3, 0x00, 0x00, 0x00, 0x00]);
    let read = new DataView(arr.buffer).getUint32(0, false);
    expect(read).to.eq(0x55aa55aa);
  });
});
The test using the Buffer fails with:
AssertionError: expected 1768779887 to equal 1437226410
+ expected - actual
-1768779887
+1437226410

Try copying the Buffer into a fresh Uint8Array with buf.copy; the new array owns an ArrayBuffer that starts at offset 0:
const buf = fs.readFileSync(`...`);
const uint8arr = new Uint8Array(buf.byteLength);
buf.copy(uint8arr, 0, 0, buf.byteLength);
const v = new DataView(uint8arr.buffer);

A Node.js Buffer is just a view over an underlying ArrayBuffer that can be much larger. Small Buffers are allocated from a shared internal pool, so buffer.buffer is the pool's ArrayBuffer and the data usually does not start at offset 0. This is how to get the correct ArrayBuffer out of a Buffer:
function getArrayBufferFromBuffer(buffer) {
  return buffer.buffer.slice(buffer.byteOffset, buffer.byteOffset + buffer.byteLength);
}
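With that helper, the failing test from the question reads the expected value (a quick check, reusing the question's leading bytes):

const arr = Buffer.from([0x55, 0xaa, 0x55, 0xaa, 0x60, 0x00, 0x00, 0x00]);
const view = new DataView(getArrayBufferFromBuffer(arr));
console.log(view.getUint32(0, false).toString(16)); // "55aa55aa"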

This helped (not sure if this is the most elegant):
const buff = Buffer.from(msgBody, 'base64');
let uint8Array = new Uint8Array(buff.length);
for (let counter = 0; counter < buff.length; counter++) {
  uint8Array[counter] = buff[counter];
  //console.debug(`uint8Array[${counter}]=${uint8Array[counter]}`);
}
let dataview = new DataView(uint8Array.buffer, 0, uint8Array.length);
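A zero-copy alternative (not from the answers above, but standard: the DataView constructor takes an optional byte offset and length) is to point the DataView directly at the Buffer's slice of its underlying ArrayBuffer:

const buff = Buffer.from(msgBody, 'base64');
// no copy needed: honor the Buffer's own offset into the shared pool
const dataview = new DataView(buff.buffer, buff.byteOffset, buff.byteLength);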

Related

Getting a value from within FileReader.onloadend (extracting first line of a csv without reading the whole file)

I'm trying to extract the first line (headers) from a CSV using TypeScript. I found a nifty function that does this using FileReader.onloadend, iterating over the bytes in the file until it reaches a line break. This function assigns the header string to a window namespace. That is unfortunately not very useful to me in a window namespace, but I can't find a workable way of getting the header string assigned to a global variable. Does anyone know how best to do this? Is this achievable with this function?
The function is here:
declare global {
  interface Window { MyNamespace: any; }
}

export const CSVImportGetHeaders = async (file: File) => {
  const reader = new FileReader();
  reader.readAsArrayBuffer(file);
  // onload triggered each time the reading operation is completed
  reader.onloadend = (evt: any) => {
    // get array buffer
    const data = evt.target.result;
    console.log('reader content ', reader);
    // get byte length
    const byteLength = data.byteLength;
    console.log('HEADER STRING ', byteLength);
    // make iterable array
    const ui8a = new Uint8Array(data, 0);
    // header string, compiled iterably
    let headerString = '';
    let finalIndex = 0;
    // eslint-disable-next-line no-plusplus
    for (let i = 0; i < byteLength; i++) {
      // get current character
      const char = String.fromCharCode(ui8a[i]);
      // check if new line
      if (char.match(/[^\r\n]+/g) !== null) {
        // if not a new line, continue
        headerString += char;
      } else {
        // if new lineBreak, stop
        finalIndex = i;
        break;
      }
    }
    window.MyNamespace = headerString.split(/,|;/);
    const potout = window.MyNamespace;
    console.log('reader result in function', potout);
  };
  const output = await window.MyNamespace;
  console.log('outside onload event', output);
};
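One way to get the header string out of the callback (a sketch, not from the original thread; the helper name getCSVHeaders is illustrative) is to wrap the FileReader in a Promise and resolve it with the parsed headers, so callers can simply await it:

export const getCSVHeaders = (file: File): Promise<string[]> =>
  new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error);
    reader.onload = () => {
      const ui8a = new Uint8Array(reader.result as ArrayBuffer);
      let headerString = '';
      for (let i = 0; i < ui8a.length; i++) {
        const char = String.fromCharCode(ui8a[i]);
        if (/[\r\n]/.test(char)) break; // stop at the first line break
        headerString += char;
      }
      resolve(headerString.split(/,|;/));
    };
    reader.readAsArrayBuffer(file);
  });

// usage, e.g. inside an async handler:
// const headers = await getCSVHeaders(file);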

How to save object with large binary data and other values?

I am currently trying to save a JS object with some binary data and other values. The result should look something like this:
{
"value":"xyz",
"file1":"[FileContent]",
"file2":"[LargeFileContent]"
}
Until now I had no binary data, so I saved everything as JSON. With the binary data I am starting to run into problems with large files (>1 GB).
I tried this approach:
JSON.stringify or how to serialize binary data as base64 encoded JSON?
That worked for smaller files of around 20 MB. However, with these large files the result of the FileReader is always an empty string.
The result would look like this:
{
"value":"xyz:,
"file1":"[FileContent]",
"file2":""
}
The code that is reading the blobs is pretty similar to the one in the other post:
const readFiles = async (measurements: FormData) => {
  setFiles([]); // This is where the result is being stored
  let promises: Array<Promise<string>> = [];
  measurements.forEach((value) => {
    let dataBlob = value as Blob;
    console.log(dataBlob); // Everything is fine here
    promises.push(
      new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.readAsDataURL(dataBlob);
        reader.onloadend = function () {
          resolve(reader.result as string);
        };
        reader.onerror = function (error) {
          reject(error);
        };
      })
    );
  });
  let result = await Promise.all(promises);
  console.log(result); // large file shows empty
  setFiles(result);
};
Is there something else I can try?
Since you have to share the data with other computers, you will have to generate your own binary format.
Obviously you can design it as you wish, but given your simple case of just storing Blob objects alongside a JSON string, we can come up with a very simple schema: we first store some metadata about the Blobs, then the JSON string where each Blob has been replaced with a UUID.
This works because the limitation you hit is actually the maximum length of a string, and we can .slice() our binary file to read only part of it. Since we never read the binary data as a string we're fine; the JSON only holds a UUID in the places where we had Blobs, so it shouldn't grow much.
Here is one such implementation I made quickly as a proof of concept:
/*
 * Stores JSON data along with Blob objects in a binary file.
 * Schema:
 * 4 first bytes = # of blobs stored in the file
 * next 4 * # of blobs = size of each Blob
 * remaining = JSON string
 */
const hopefully_unique_id = "_blob_"; // <-- change that

function generateBinary(JSObject) {
  let blobIndex = 0;
  const blobsMap = new Map();
  const JSONString = JSON.stringify(JSObject, (key, value) => {
    if (value instanceof Blob) {
      // reuse the same placeholder if the same Blob appears more than once
      if (blobsMap.has(value)) {
        return blobsMap.get(value);
      }
      const placeholder = hopefully_unique_id + blobIndex++;
      blobsMap.set(value, placeholder);
      return placeholder;
    }
    return value;
  });
  const blobsArr = [...blobsMap.keys()];
  const data = [
    new Uint32Array([blobsArr.length]),                      // # of blobs
    ...blobsArr.map((blob) => new Uint32Array([blob.size])), // each Blob's size
    ...blobsArr,                                             // the raw binary payloads
    JSONString                                               // the JSON with placeholders
  ];
  return new Blob(data);
}
async function readBinary(bin) {
  // first 4 bytes: number of Blobs stored in the file
  const numberOfBlobs = new Uint32Array(await bin.slice(0, 4).arrayBuffer())[0];
  let cursor = 4 * (numberOfBlobs + 1);
  // next 4 bytes per Blob: their sizes
  const blobSizes = new Uint32Array(await bin.slice(4, cursor).arrayBuffer());
  const blobs = [];
  for (let i = 0; i < numberOfBlobs; i++) {
    const blobSize = blobSizes[i];
    blobs.push(bin.slice(cursor, cursor += blobSize));
  }
  const pattern = new RegExp(`^${hopefully_unique_id}\\d+$`);
  const JSObject = JSON.parse(
    await bin.slice(cursor).text(),
    (key, value) => {
      if (typeof value !== "string" || !pattern.test(value)) {
        return value;
      }
      // map the placeholder back to its Blob (indices start at 0)
      const index = +value.replace(hopefully_unique_id, "");
      return blobs[index];
    }
  );
  return JSObject;
}
// demo usage
(async () => {
  const obj = {
    foo: "bar",
    file1: new Blob(["Let's pretend I'm actually binary data"]),
    // This one is 512MiB, which is bigger than the max string size in Chrome,
    // i.e. it can't be stored in a JSON string in Chrome
    file2: new Blob([Uint8Array.from({ length: 512 * 1024 * 1024 }, () => 255)]),
  };
  const bin = generateBinary(obj);
  console.log("as binary", bin);
  const back = await readBinary(bin);
  console.log({ back });
  console.log("file1 read as text:", await back.file1.text());
})().catch(console.error);

Async JS validation issues for html textarea

I'm trying to replicate the code in this article:
https://depth-first.com/articles/2020/08/24/smiles-validation-in-the-browser/
What I'm trying to do differently is to use a textarea instead of an input to take multi-line input. In addition to displaying an error message, I also want to display the entries that don't pass validation.
The original validation script is this:
const path = '/target/wasm32-unknown-unknown/release/smival.wasm';

const read_smiles = instance => {
  return smiles => {
    const encoder = new TextEncoder();
    const encoded = encoder.encode(`${smiles}\0`);
    const length = encoded.length;
    const pString = instance.exports.alloc(length);
    const view = new Uint8Array(
      instance.exports.memory.buffer, pString, length
    );
    view.set(encoded);
    return instance.exports.read_smiles(pString);
  };
};

const watch = instance => {
  const read = read_smiles(instance);
  document.querySelector('input').addEventListener('input', e => {
    const { target } = e;
    if (read(target.value) === 0) {
      target.classList.remove('invalid');
    } else {
      target.classList.add('invalid');
    }
  });
};

(async () => {
  const response = await fetch(path);
  const bytes = await response.arrayBuffer();
  const wasm = await WebAssembly.instantiate(bytes, { });
  watch(wasm.instance);
})();
For working with a textarea, I've changed the watch function to this and added a <p id="indicator"> element to the HTML to display an error:
const watch = instance => {
  const read = read_smiles(instance);
  document.querySelector("textarea").addEventListener('input', e => {
    const { target } = e;
    var lines_array = target.value.split('/n');
    var p = document.getElementById("indicator");
    p.style.display = "block";
    p.innerHTML = "The size of the input is : " + lines_array.length;
    if (read(target.value) === 0) {
      target.classList.remove('invalid');
    } else {
      target.classList.add('invalid');
    }
  });
};
I'm not even able to get a count of the entries that fail validation. I believe this is async JS, and as a beginner in JavaScript I find it hard to follow what is happening here, especially the part that looks like the function e is referencing itself:
document.querySelector("textarea").addEventListener('input', e => {
const { target } = e;
Can someone please help me understand this code, and figure out how to count the entries that fail validation and print the string/index of those entries to help the user?
There is a mistake in your code when counting entries in the textarea:
var lines_array = target.value.split('\n'); // replace /n with \n
Regarding the part that looks like "the function e referencing itself": that is destructuring assignment, a JavaScript expression that unpacks values from arrays, or properties from objects, into distinct variables. You can find more information at MDN Web Docs - Destructuring assignment.
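To also get the count and contents of the failing entries, one option (a sketch, not from the original answer; it assumes read validates a single SMILES string at a time, as in the article) is to validate line by line inside watch and collect the failures:

const watch = instance => {
  const read = read_smiles(instance);
  document.querySelector('textarea').addEventListener('input', e => {
    const { target } = e; // same as: const target = e.target;
    const lines = target.value.split('\n');
    // collect the index and text of every non-empty line that fails validation
    const failures = lines
      .map((line, index) => ({ line, index }))
      .filter(({ line }) => line !== '' && read(line) !== 0);
    const p = document.getElementById('indicator');
    p.style.display = 'block';
    p.textContent = failures.length === 0
      ? 'All entries are valid'
      : 'Invalid entries: ' + failures.map(f => `#${f.index + 1}: ${f.line}`).join(', ');
    target.classList.toggle('invalid', failures.length > 0);
  });
};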

Best way to make file buffer accessible outside a callback function in Javascript for large files?

I am new to JS, and I need to load file1, decompress part of it into file2, and then make that decompressed file2 available for the user to download, all completely browser-side (no Node.js etc.).
For decompression I have:
let fb;
const decB = document.querySelector('button[id="dec"]');
const inputB = document.querySelector('input[type="file"]');

inputB.addEventListener('change', function (e) {
  const r = new FileReader();
  r.onload = function () {
    // start and length (defined elsewhere) select the part to decompress
    const archive = new Uint8Array(r.result, start, length);
    try {
      fb = pako.inflate(archive);
    } catch (err) {
      console.log(err);
    }
  };
  r.readAsArrayBuffer(inputB.files[0]);
}, false);

decB.addEventListener("click", function (e) {
  try {
    const t = new TextDecoder().decode(fb);
    console.log(t);
  } catch (err) {
    console.log(err);
  }
}, false);
I want to be able to access the contents of the result in other functions. Is using a global variable the best way to do it, or is there a more proper solution?
Here is a tiny dependency-free variant:
function decompressBlob(blob) {
  const ds = new DecompressionStream('gzip');
  const decompressedStream = blob.stream().pipeThrough(ds);
  return new Response(decompressedStream).blob();
}

function compressBlob(blob) {
  const cs = new CompressionStream('gzip');
  const compressedStream = blob.stream().pipeThrough(cs);
  return new Response(compressedStream).blob();
}
const file = new File(['abc'.repeat(100)], 'filename.txt');
console.log('original file size', file.size);

compressBlob(file).then(async newBlob => {
  console.log('compressed blob size:', newBlob.size);
  const decompressedBlob = await decompressBlob(newBlob);
  const content1 = await decompressedBlob.text();
  const content2 = await file.text();
  const expected = 'abc'.repeat(100);
  console.log('same content:', content1 === expected);
  console.log('same content:', content2 === expected);
});
Then if you want to download it, create an object URL and attach it to a link with a download attribute:
const a = document.createElement('a');
a.href = URL.createObjectURL(blob);
a.download = originalFile.name + '.gz';
a.click();
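Once the download has been triggered, it's good practice to release the object URL again (a small addition, not in the original answer):

// release the object URL after the download has started
setTimeout(() => URL.revokeObjectURL(a.href));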
I guess if you want to avoid callbacks and a giant code block, you can use async/await along with the newer promise-based reading methods on the Blob itself (https://developer.mozilla.org/en-US/docs/Web/API/Blob/arrayBuffer):
input.addEventListener('change', async evt => {
  const [file] = input.files
  if (file) {
    const arrayBuffer = await file.slice(28, 7412).arrayBuffer()
    const compressed = new Uint8Array(arrayBuffer)
    fileBuffer = pako.inflate(compressed)
    document.getElementById('Decompress').disabled = false
  } else {
    // input was cleared
  }
})
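As for the global-variable part of the question: instead of a mutable global, you can keep the pending result as a Promise and await it where needed. A minimal sketch, assuming the same input, decB and pako setup as the earlier snippets:

let fileBufferPromise = null;

input.addEventListener('change', () => {
  const [file] = input.files;
  // store the pending decompression as a Promise instead of a global value
  fileBufferPromise = file
    ? file.slice(28, 7412).arrayBuffer().then(ab => pako.inflate(new Uint8Array(ab)))
    : null;
});

decB.addEventListener('click', async () => {
  if (!fileBufferPromise) return;
  const fb = await fileBufferPromise; // resolves once decompression is done
  console.log(new TextDecoder().decode(fb));
});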

Get ReadableStream from Webcam in Browser

I would like to get webcam input as a ReadableStream in the browser to pipe to a WritableStream. I have tried using the MediaRecorder API, but that stream is chunked into separate blobs while I would like one continuous stream. I'm thinking the solution might be to pipe the MediaRecorder chunks to a unified buffer and read from that as a continuous stream, but I'm not sure how to get that intermediate buffer working.
mediaRecorder = new MediaRecorder(stream, recorderOptions);
mediaRecorder.ondataavailable = handleDataAvailable;
mediaRecorder.start(1000);

async function handleDataAvailable(event) {
  if (event.data.size > 0) {
    const data: Blob = event.data;
    // I think I need to pipe to an intermediate stream? Not sure how tho
    data.stream().pipeTo(writable);
  }
}
Currently we can't really access the raw data of a MediaStream; the closest we have for video is the MediaRecorder API, but it encodes the data and works in chunks, not as a stream.
However, there is a new MediaCapture Transform W3C group working on a MediaStreamTrackProcessor interface that does exactly what you want, and it is already available in Chrome behind the chrome://flags/#enable-experimental-web-platform-features flag.
When reading the resulting stream, and depending on which kind of track you passed, you'll get access to VideoFrames or AudioFrames, which are being added by the new WebCodecs API.
if (window.MediaStreamTrackProcessor) {
  const track = getCanvasTrack();
  const processor = new MediaStreamTrackProcessor(track);
  const reader = processor.readable.getReader();
  readChunk();
  function readChunk() {
    reader.read().then(({ done, value }) => {
      // value is a VideoFrame
      // we can read the data in each of its planes into an ArrayBufferView
      const channels = value.planes.map((plane) => {
        const arr = new Uint8Array(plane.length);
        plane.readInto(arr);
        return arr;
      });
      value.close(); // close the VideoFrame when we're done with it
      log.textContent = "planes data (15 first values):\n" +
        channels.map((arr) => JSON.stringify([...arr.subarray(0, 15)])).join("\n");
      if (!done) {
        readChunk();
      }
    });
  }
} else {
  console.error("your browser doesn't support this API yet");
}

function getCanvasTrack() {
  // just some noise...
  const canvas = document.getElementById("canvas");
  const ctx = canvas.getContext("2d");
  const img = new ImageData(300, 150);
  const data = new Uint32Array(img.data.buffer);
  const track = canvas.captureStream().getVideoTracks()[0];
  anim();
  return track;

  function anim() {
    for (let i = 0; i < data.length; i++) {
      data[i] = Math.random() * 0xFFFFFF + 0xFF000000;
    }
    ctx.putImageData(img, 0, 0);
    if (track.readyState === "live") {
      requestAnimationFrame(anim);
    }
  }
}
<pre id="log"></pre>
<p>
Source<br>
<canvas id="canvas"></canvas>
</p>
