JS - How to compute MD5 on binary data - javascript

EDIT: changed title from "JS File API - write and read UTF-8 data is inconsistent" to reflect the actual question.
I have some binary content I need to calculate the MD5 of. The content is a WARC file, which means it holds text as well as encoded images. To avoid errors in saving the file, I convert and store all the data in ArrayBuffers. All the data is put into Uint8Arrays to convert it to UTF-8.
My first attempt, for testing, was to use the saveAs library to save files from a Chrome extension. This means passing a Blob object to the method that creates the file.
var b = new Blob(arrayBuffers, {type: "text/plain;charset=utf-8"});
saveAs(b,'name.warc');
I haven't found a tool that computes the MD5 of a Blob object directly, so what I was doing was using a FileReader to read the blob as binary data and then an MD5 tool (I tried CryptoJS as well as a tool from faultylabs) to compute the result.
var f = new FileReader();
f.onloadend = function(a) {
    console.log('Original file checksum: ', faultylabs.MD5(this.result));
};
f.readAsBinaryString(b);
The resources (images) are downloaded directly in ArrayBuffer format, so I have no need to convert them.
The result was wrong: the MD5 computed in the code and the MD5 of the file I saved on my local machine gave two different values. Reading the blob as text, unsurprisingly, throws an error.
The workaround I found consists of writing the Blob object to disk using the FileSystem API, reading it back as binary data, computing the MD5, and then saving that retrieved file as the WARC file (not the original Blob object, but this "refreshed" version of the file).
In this case the computed MD5 is fine (I calculate it on the "refreshed" version of the WARC file), but when I launch the WARC replay instance with the "refreshed" archive it throws errors, while with the original file I don't have any problem (but the MD5 is not correct).
var fd = new FormData();
// To compute the MD5 hash and have it correct on the server side, we need to write
// the file to the filesystem, read it back, and then calculate the MD5 value.
// We need to send this version of the WARC file to the server as well.
window.requestFileSystem = window.requestFileSystem || window.webkitRequestFileSystem;

function computeWARC_MD5(callback, formData) {
    window.requestFileSystem(window.TEMPORARY, b.size, onInitFs);

    function onInitFs(fs) {
        fs.root.getFile('warc.warc', {create: true}, function(fileEntry) {
            fileEntry.createWriter(function(fileWriter) {
                fileWriter.onwriteend = function(e) {
                    readAndMD5();
                };
                fileWriter.onerror = function(e) {
                    console.error('Write failed: ' + e.toString());
                };
                fileWriter.write(b);
            });
        });

        function readAndMD5() {
            fs.root.getFile('warc.warc', {}, function(fileEntry) {
                fileEntry.file(function(file) {
                    var reader = new FileReader();
                    reader.onloadend = function(e) {
                        var warcMD5 = faultylabs.MD5(this.result);
                        console.log(warcMD5);
                        var g = new Blob([this.result], {type: "text/plain;charset=utf-8"});
                        saveAs(g, o_request.file);
                        formData.append('warc_file', g);
                        formData.append('warc_checksum_md5', warcMD5.toLowerCase());
                        callback(formData);
                    };
                    reader.readAsBinaryString(file);
                });
            });
        }
    }
}
function uploadData(formData) {
    // upload
    $.ajax({
        type: 'POST',
        url: server_URL_upload,
        data: formData,
        processData: false,
        contentType: false,
        // [SPECS] fire a progress event named progress at the XMLHttpRequestUpload object
        // about every 50ms or for every byte transmitted, whichever is least frequent
        xhrFields: {
            onprogress: function (e) {
                if (e.lengthComputable) {
                    console.log(e.loaded / e.total * 100 + '%');
                }
            }
        }
    }).done(function(data) {
        console.log('done uploading!');
        //displayMessage(port_to_page, 'Upload finished!', 'normal')
        //port_to_page.postMessage( { method:"doneUpload" } );
    });
}
computeWARC_MD5(uploadData, fd);
saveAs(b, 'warc.warc');
Could anybody explain to me why there is this discrepancy? What am I missing in treating all the objects I am dealing with as binary data (storing, reading)?

Basically, I tried another route: I converted the Blob back to an ArrayBuffer and computed the MD5 on that. At that point, the saved file's MD5 and the ArrayBuffer's MD5 are the same.
var b = new Blob(arrayBuffers, {type: "text/plain;charset=utf-8"});
var blobHtml = new Blob([str2ab(o_request.main_page_html)], {type: "text/plain;charset=utf-8"});
var f = new FileReader();
f.onloadend = function(a) {
    var warcMD5 = faultylabs.MD5(this.result);
    var fd = new FormData();
    fd.append('warc_file', b);
    fd.append('warc_checksum_md5', warcMD5.toLowerCase());
    uploadData(fd);
};
f.readAsArrayBuffer(b);
I guess an MD5 computed over a binary string differs from one computed over an ArrayBuffer of the same bytes, which is why the checksum was inconsistent.
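To make that concrete, here is a minimal helper distilled from the code above (the helper name is mine): it always reads the Blob as an ArrayBuffer before hashing, which is the representation faultylabs.MD5 handled consistently here.

function md5OfBlob(blob, callback) {
    var reader = new FileReader();
    reader.onloadend = function() {
        // hash the ArrayBuffer, never the binary string
        callback(faultylabs.MD5(reader.result).toLowerCase());
    };
    reader.readAsArrayBuffer(blob);
}
md5OfBlob(b, function(md5) {
    console.log('WARC checksum:', md5);
});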

Related

JavaScript Blob to FormData with SAPUI5 mobile app

I know there are several threads about this topic, but I was not able to identify the problem in my case.
I have an application where I upload an image to an endpoint URL and receive a response after processing. This works fine so far. The file is contained in a FormData object when using the FileUploader control from SAPUI5.
When switching from file upload to "taking a picture with the smartphone camera", I don't have a file; I have a base64 data URL (XString) image object.
var oImage = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABQAA…8ryQAbwUjsV5VUaAX/y+YSPJii2Z9GAAAAABJRU5ErkJggg=="} // some lines are missing > 1 million lines
I thought converting it to a blob and appending it to FormData might be the solution, but it does not work at all.
var blob = this.toBlob(oImage);
console.log("Blob", blob); // --> Blob(857809) {size: 857809, type: "image/png"}
var formData = new window.FormData();
formData.append("files", blob, "test.png");
console.log("FormData", formData); // seems empty --> FormData {}
The conversion function (works fine from my perspective):
toBlob: function dataURItoBlob(dataURI) {
    var byteString = atob(dataURI.split(',')[1]);
    var mimeString = dataURI.split(',')[0].split(':')[1].split(';')[0];
    var ab = new ArrayBuffer(byteString.length);
    var ia = new Uint8Array(ab);
    for (var i = 0; i < byteString.length; i++) {
        ia[i] = byteString.charCodeAt(i);
    }
    var bb = new Blob([ab], {
        "type": mimeString
    });
    return bb;
},
This is my problem: FormData appears empty, and my POST request throws an undefined error (Loading of data failed: TypeError: Cannot read property 'status' of undefined at constructor.eval (...m/resources/sap/ui/core/library-preload.js?eval:2183:566)).
// Create JSON Model with URL
var oModel = new sap.ui.model.json.JSONModel();
var sHeaders = {
    "content-type": "multipart/form-data; boundary=---011000010111000001101001",
    "APIKey": "<<myKey>>"
};
var oData = {
    formData
};
oModel.loadData("/my-destination/service", oData, true, "POST", null, false, sHeaders);
oModel.attachRequestCompleted(function (oEvent) {
    var oData = oEvent.getSource().oData;
    console.log("Final Response XHR: ", oData);
});
Thanks for any hint
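One thing worth checking before switching controls: JSONModel.loadData is designed for JSON payloads, and a hard-coded multipart boundary will never match the boundary the browser generates for a FormData body. A minimal sketch of a workaround (URL and APIKey header taken from the question) is to POST the FormData with a plain XMLHttpRequest and let the browser set the content type itself:

var xhr = new XMLHttpRequest();
xhr.open("POST", "/my-destination/service");
xhr.setRequestHeader("APIKey", "<<myKey>>");
// do NOT set Content-Type manually: the browser adds
// multipart/form-data with the correct boundary for FormData bodies
xhr.onload = function () {
    console.log("Final Response XHR: ", xhr.responseText);
};
xhr.send(formData);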
The upload collection is a complex standard control that can be used for attachment management. On desktop it opens a file dialog; on mobile it opens the iOS or Android photo options, which means picking a photo from the camera roll or taking a new photo.
Here's a fairly basic example, including the upload URLs and the other handlers you'll need. More options are available; adjust to suit your needs. In your XML:
<UploadCollection
    uploadUrl="{path:'Key',formatter:'.headerUrl'}/Attachments"
    items="{Attachments}"
    change="onAttachUploadChange"
    fileDeleted="onAttachDelete"
    uploadEnabled="true"
    uploadComplete="onAttachUploadComplete">
    <UploadCollectionItem
        documentId="{DocID}"
        contributor="{CreatedBy}"
        fileName="{ComponentName}"
        fileSize="{path:'ComponentSize',formatter:'.formatter.parseFloat'}"
        mimeType="{MIMEType}"
        thumbnailUrl="{parts:[{path:'MIMEType'},{path:'DocID'}],formatter:'.thumbnailURL'}"
        uploadedDate="{path:'CreatedAt', formatter:'.formatter.Date'}"
        url="{path:'DocID',formatter:'.attachmentURL'}"
        visibleEdit="false"
        visibleDelete="true" />
</UploadCollection>
Here are the handlers. The onAttachUploadChange one is especially important. I should mention there's no explicit POST: if the uploadUrl is set correctly, a POST is triggered anyway.
onAttachUploadChange: function(oEvent) {
    var csrf = this.getModel().getSecurityToken();
    var oUploader = oEvent.getSource();
    var fileName = oEvent.getParameter('files')[0].name;
    oUploader.removeAllHeaderParameters();
    oUploader.insertHeaderParameter(new UploadCollectionParameter({
        name: 'x-csrf-token',
        value: csrf
    }));
    oUploader.insertHeaderParameter(new UploadCollectionParameter({
        name: 'Slug',
        value: fileName
    }));
},
onAttachDelete: function(oEvent) {
    var id = oEvent.getParameter('documentId');
    var oModel = this.getModel();
    // set busy indicator maybe?
    oModel.remove(`/Attachments('${encodeURIComponent(id)}')`, {
        success: (odata, response) => {
            // successful removal
            // oModel.refresh();
        },
        error: err => console.log(err)
    });
},
onAttachUploadComplete: function(oEvent) {
    var mParams = oEvent.getParameter('mParameters');
    // handle errors and success in here. Check `mParams`.
}
As for the formatters that determine the URLs, that depends on your setup. In the case below, the stream is set up on the current binding context, in which case this is one way to do it. You'll need the whole URI, so include the /sap/opu/... etc. bits.
headerUrl: function() {
    return this.getModel().sServiceUrl + this.getView().getBindingContext().getPath();
},
The URL for attachments is similar, but generally points to an entity of the attachment service itself.
attachmentURL: function(docid) {
    return this.getModel().sServiceUrl + "/Attachments('" + docid + "')/$value";
},
You could fancy it up to check if it's an image, in which case you could include the mime type to show a thumbnail.
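For instance, a thumbnailURL formatter matching the XML binding above could look like this (a sketch, assuming the same attachment service layout as attachmentURL):

thumbnailURL: function(mime, docid) {
    // only images get a real preview; other types fall back to the default icon
    if (mime && mime.indexOf('image/') === 0) {
        return this.getModel().sServiceUrl + "/Attachments('" + docid + "')/$value";
    }
    return undefined;
},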
There might be better ways of doing this, but I've found this fairly flexible...

Convert Blob data to Raw buffer in javascript or node

I am using the jsPDF plugin, which generates a PDF and saves it to the local file system. Now, in jsPDF.js, there is a piece of code that generates the PDF data in blob format:
var blob = new Blob([array], {type: "application/pdf"});
and then saves the blob data to the local file system. Instead of saving, I need to print the PDF using the node-printer plugin.
Here is some sample code to do so:
var fs = require('fs');
var printer = require("../lib");

fs.readFile('/home/ubuntu/test.pdf', function(err, data) {
    // readFile is asynchronous: print from inside the callback,
    // once the raw buffer is actually available
    printer.printDirect({
        data: data,
        printer: 'Deskjet_3540',
        type: 'PDF',
        success: function(id) {
            console.log('printed with id ' + id);
        },
        error: function(err) {
            console.error('error on printing: ' + err);
        }
    });
});
fs.readFile() reads the PDF file and produces its data as a raw buffer. Now what I want is to convert the Blob data into such a raw buffer so that I can print the PDF.
If you are not using Node.js, you should know that the browser has no Buffer class implementation; you are probably compiling your code for a browser environment with something like browserify. In that case you need a library that provides a Buffer class as close to the Node.js Buffer as possible (the implementation is at feross/buffer).
If you are using node-fetch (not OP's case) then you probably got a blob from a response object:
const fetch = require("node-fetch");
const response = await fetch("http://www.stackoverflow.com/");
const blob = await response.blob();
This blob is an internal implementation and exists only inside the node-fetch and fetch-blob libraries; to convert it to a native Node.js Buffer object you need to transform it into an ArrayBuffer first:
const arrayBuffer = await blob.arrayBuffer();
const buffer = Buffer.from(arrayBuffer);
This buffer object can then be used on things such as file writes and server responses.
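For instance, the converted buffer can be written straight to disk (the file name here is purely illustrative):

const fs = require("fs");
// persist the converted response, or hand `buffer` to node-printer as above
fs.writeFileSync("page.html", buffer);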
For me, it worked with the following:
const buffer = Buffer.from(blob, 'binary');
So, this buffer can be stored in Google Cloud Storage or on the local disk with the fs node package.
I used a blob file to send data from client to server through the DDP protocol (Meteor), so when the file arrives at the server I convert it to a buffer in order to store it.
var blob = new Blob([array], {type: "application/pdf"});
var arrayBuffer, uint8Array;
var fileReader = new FileReader();
fileReader.onload = function() {
    arrayBuffer = this.result;
    uint8Array = new Uint8Array(arrayBuffer);
    var printer = require("./js/controller/lib");
    printer.printDirect({
        data: uint8Array,
        printer: 'Deskjet_3540',
        type: 'PDF',
        success: function(id) {
            console.log('printed with id ' + id);
        },
        error: function(err) {
            console.error('error on printing: ' + err);
        }
    });
};
fileReader.readAsArrayBuffer(blob);
This is the final code that worked for me. The printer accepts the Uint8Array encoding format.
Try:
var blob = new Blob([array], {type: "application/pdf"});
var buffer = new Buffer(blob, "binary");

gzip form input files

I am extremely new to JavaScript/web programming. The main use of my upload form is CSV files, mostly. I am already using pako to gzip my JSON (in the request URL).
How can I gzip the files before they are sent to the server?
This is roughly how I construct the formdata
$.each($("input[type=file]"), function(i, obj) {
$.each(obj.files, function(j, file) {
formData.append(obj.name, file); // we need to gzip the data
})
});
Edit 1: I've managed to (I think) gzip the files using pako, but there's one issue: async problems. This is my new code:
$.each($("input[type=file]"), function(i, obj) {
$.each(obj.files, function(j, file) {
formData.append(obj.name, file); // we need to gzip the data
var r = new FileReader();
r.onload = function(){
var zippedResult = pako.gzip(r.result);
var oMyBlob = new Blob(zippedResult, {type : file.type}); // the blob
formData.append(obj.name, oMyBlob); // we need to gzip the data
};
r.readAsArrayBuffer(file);
})
});
// Time to send the formData!
$.ajax({......
As you can see, the issue is that the onload callbacks only run after the ajax call has already executed, so the formData is blank.
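One common fix (a sketch, assuming a browser with Promise support): wrap each FileReader in a Promise and send only once all of them have resolved.

var reads = [];
$.each($("input[type=file]"), function(i, obj) {
    $.each(obj.files, function(j, file) {
        reads.push(new Promise(function(resolve) {
            var r = new FileReader();
            r.onload = function() {
                // pako.gzip returns a Uint8Array, which Blob accepts inside an array
                var zipped = pako.gzip(new Uint8Array(r.result));
                formData.append(obj.name, new Blob([zipped], {type: file.type}), file.name);
                resolve();
            };
            r.readAsArrayBuffer(file);
        }));
    });
});
Promise.all(reads).then(function() {
    // safe to send now: every file has been read and gzipped
    $.ajax({ /* ... send formData here ... */ });
});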
Edit 2: I'm attempting to create an onchange event for the file inputs; this is what I have come up with so far. There is a problem though: it doesn't seem to be zipping correctly. Data type issues?
$("input[type=file]").change(function (event){
var fileList = this.files;
$.each(fileList,function(i,file){
var r = new FileReader();
r.onload = function(){
var zippedResult = pako.gzip(r.result);
var oMyBlob = new Blob(zippedResult, {type : file.type});
app.formData.append(event.target.name, oMyBlob, file.name);
};
r.readAsArrayBuffer(file);
});
});
This is what I've done. Note that formData is a global variable. Take care to clear the formData once you've submitted, otherwise it will just keep growing. Also, if you re-select a file it will be appended to the form again (which might not be what you want); I have not yet found a way around that.
$("input[type=file]").change(function (event){
var fileList = this.files;
$.each(fileList,function(i,file){
var r = new FileReader();
r.onload = function(){
var convertedData = new Uint8Array(r.result);
// Zipping Uint8Array to Uint8Array
var zippedResult = pako.gzip(convertedData, {to : "Uint8Array"});
// Need to convert back Uint8Array to ArrayBuffer for blob
var convertedZipped = zippedResult.buffer;
var arrayBlob = new Array(1);
arrayBlob[0] = convertedZipped;
// Creating a blob file with array of ArrayBuffer
var oMyBlob = new Blob(arrayBlob , {type : file.type} ); // the blob (we need to set file.type if not it defaults to application/octet-stream since it's a gzip, up to you)
app.formData.append(event.target.name, oMyBlob, file.name); // we need to gzip the data
};
r.readAsArrayBuffer(file);
});
});
I have added gzip support to JSZip.
I needed JSZip and gzip for my web page, and JSZip has all the ingredients, but hides them in ways I couldn't crack.
JSZip is much better designed, with the ability to process big files in chunked/streaming mode; I don't think pako alone does that. I am using both ZIP files and gzip in my project, so I figured basing them on the same package would be useful.

Is it possible to save a File object in LocalStorage and then reload a File via FileReader when a user comes back to a page?

For example, say the user loads some very large images or media files into your web app. When they return, you want your app to show what they've previously loaded, but you can't keep the actual file data in localStorage because it's too large.
This is NOT possible with localStorage. Data stored in localStorage has to be serialized to a string, and a File object is not something you can meaningfully serialize that way.
For example, this will not work as you'd expect:
var el = document.createElement('input');
el.type='file';
el.onchange = function(e) {
localStorage.file = JSON.stringify(this.files[0]);
// LATER ON...
var reader = new FileReader();
reader.onload = function(e) {
var result = this.result; // never reaches here.
};
reader.readAsText(JSON.parse(localStorage.f));
};
document.body.appendChild(el);
The solution is to use a more powerful storage option like writing the file contents to the HTML5 Filesystem or stashing it in IndexedDB.
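For completeness, a minimal sketch of the IndexedDB route (the store name and key are mine): modern browsers can store File and Blob objects directly via the structured clone algorithm.

var open = indexedDB.open('fileStore', 1);
open.onupgradeneeded = function() {
    open.result.createObjectStore('files');
};
open.onsuccess = function() {
    var tx = open.result.transaction('files', 'readwrite');
    // `file` is a File object taken from an <input type="file">
    tx.objectStore('files').put(file, 'lastUpload');
};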
Technically you can, if you just need to save small files in localStorage.
Just base64 that ish and, since it's a string... it's localStorage-friendly.
Note that localStorage has a ~5MB limit in most browsers, and base64 inflates the data by roughly a third, so this lazy man's way only works for small images. Still, it could definitely be a solution depending on your needs.
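A minimal sketch of that lazy approach (assuming `input` is a file input element; the storage key is mine): read the file as a base64 data URL and stash the string.

input.onchange = function() {
    var reader = new FileReader();
    reader.onload = function() {
        try {
            localStorage.setItem('savedImage', reader.result); // "data:image/png;base64,..."
        } catch (e) {
            console.log('Over the quota: ' + e);
        }
    };
    reader.readAsDataURL(this.files[0]);
};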
Yes, this is possible. You can insert whatever information about the file you want into localStorage, provided you serialize it to one of the supported primitive types. You can also serialize the whole file into localStorage and retrieve it later if you want, but there are limits on file size depending on the browser.
The following shows how to achieve this using two different approaches:
(function () {
    // localStorage with image
    var storageFiles = JSON.parse(localStorage.getItem("storageFiles")) || {},
        elephant = document.getElementById("elephant"),
        storageFilesDate = storageFiles.date,
        date = new Date(),
        todaysDate = (date.getMonth() + 1).toString() + date.getDate().toString();
    // Compare date and create localStorage if it's not existing/too old
    if (typeof storageFilesDate === "undefined" || storageFilesDate < todaysDate) {
        // Take action when the image has loaded
        elephant.addEventListener("load", function () {
            var imgCanvas = document.createElement("canvas"),
                imgContext = imgCanvas.getContext("2d");
            // Make sure canvas is as big as the picture
            imgCanvas.width = elephant.width;
            imgCanvas.height = elephant.height;
            // Draw image into canvas element
            imgContext.drawImage(elephant, 0, 0, elephant.width, elephant.height);
            // Save image as a data URL
            storageFiles.elephant = imgCanvas.toDataURL("image/png");
            // Set date for localStorage
            storageFiles.date = todaysDate;
            // Save as JSON in localStorage
            try {
                localStorage.setItem("storageFiles", JSON.stringify(storageFiles));
            }
            catch (e) {
                console.log("Storage failed: " + e);
            }
        }, false);
        // Set initial image src
        elephant.setAttribute("src", "elephant.png");
    }
    else {
        // Use image from localStorage
        elephant.setAttribute("src", storageFiles.elephant);
    }

    // Getting a file through XMLHttpRequest as an arraybuffer and creating a Blob
    var rhinoStorage = localStorage.getItem("rhino"),
        rhino = document.getElementById("rhino");
    if (rhinoStorage) {
        // Reuse existing Data URL from localStorage
        rhino.setAttribute("src", rhinoStorage);
    }
    else {
        // Create XHR and FileReader objects
        var xhr = new XMLHttpRequest(),
            blob,
            fileReader = new FileReader();
        xhr.open("GET", "rhino.png", true);
        // Set the responseType to arraybuffer. "blob" is an option too, rendering
        // BlobBuilder unnecessary, but the support for "blob" is not widespread enough yet
        xhr.responseType = "arraybuffer";
        xhr.addEventListener("load", function () {
            if (xhr.status === 200) {
                // Create a blob from the response
                blob = new Blob([xhr.response], {type: "image/png"});
                // onload needed since Google Chrome doesn't support addEventListener for FileReader
                fileReader.onload = function (evt) {
                    // Read out file contents as a Data URL
                    var result = evt.target.result;
                    // Set image src to Data URL
                    rhino.setAttribute("src", result);
                    // Store Data URL in localStorage
                    try {
                        localStorage.setItem("rhino", result);
                    }
                    catch (e) {
                        console.log("Storage failed: " + e);
                    }
                };
                // Load blob as Data URL
                fileReader.readAsDataURL(blob);
            }
        }, false);
        // Send XHR
        xhr.send();
    }
})();

If there is no way to read a binary file on client-side Javascript, how to parse a text file?

I'm going to put a text file on my ISP's server, e.g. http://home.ISP.net/~foobar/text.txt.
How can I read that with Javascript in the browser e.g. from http://home.ISP.net/~foobar/textreader.html?
I already know that I can't read a binary file that's on a web server from inside the browser.
Thanks.
Since you're gonna read it from the same domain, just use Ajax to load the file.
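For example (a plain-XHR sketch, no jQuery needed), since textreader.html and text.txt live on the same domain:

var xhr = new XMLHttpRequest();
xhr.open("GET", "text.txt", true);
xhr.onload = function() {
    if (xhr.status === 200) {
        console.log(xhr.responseText); // parse the file contents here
    }
};
xhr.send();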
I think you can read a file on the client side just by using a file input tag and loading it with a FileReader object, though I don't know whether it's supported in browsers other than Firefox.
You can read whatever file you want on the web server from inside the browser, as long as it's not in a "protected" path like "WEB-INF"...
Probably the easiest way to read the file is to use jQuery's "load":
http://api.jquery.com/load/
or its "ajax" method:
http://api.jquery.com/jQuery.ajax/
In my cross-platform app I needed to fetch documents from a cloud SharePoint and store them on the local filesystem via Cordova. Here are the relevant parts of the code that work for me:
1) Ajax call to get and download the file
$.ajax({
    url: fileurl, // fully qualified URL to the document
    dataType: 'text',
    mimeType: 'text/plain; charset=x-user-defined',
    async: false,
    cache: false
}).done(function(data) {
    data = str2ab(data);
    writeData(newfile, data);
});
2) Prepare the fetched binary data
function str2ab(str) {
    var buf = new ArrayBuffer(str.length); // 1 byte for each char
    var bufView = new Uint8Array(buf);
    for (var i = 0, strLen = str.length; i < strLen; i++) {
        bufView[i] = str.charCodeAt(i);
    }
    return buf;
}
3) Write the data to the filesystem using the Cordova file plugin. Hint: it is assumed the filesystem "fs" has been initialized beforehand.
function writeData(path, data)
{
    var newfile = path;
    fs.root.getFile(newfile, {create: true, exclusive: false},
        function(fileEntry)
        {
            fileEntry.createWriter(function(fileWriter) {
                fileWriter.onwriteend = function(e) {
                    console.log('Write completed: ' + fileEntry.fullPath);
                };
                fileWriter.onerror = function(e) {
                    console.log('Write failed: ' + e.toString());
                };
                fileWriter.write(data);
            });
        },
        function() {
            alert("ERROR WRITE FILE");
        }
    );
}
Your HTML document can use an XMLHttpRequest to fetch the text document.
Note that the file could also be a binary file (e.g. an image), although parsing it would require some JavaScript trickery, e.g. treating it as a string of bytes and processing it with string functions like substr.
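A sketch of that byte-string trick (the same charset=x-user-defined idea used in the Cordova answer above; the file name is illustrative):

var xhr = new XMLHttpRequest();
xhr.open("GET", "image.png", true);
// force a byte-per-character encoding so the bytes are not reinterpreted
xhr.overrideMimeType("text/plain; charset=x-user-defined");
xhr.onload = function() {
    // mask each char code to a single byte when reading
    var firstByte = xhr.responseText.charCodeAt(0) & 0xFF;
    console.log("first byte:", firstByte);
};
xhr.send();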
