gzip form input files - javascript

I am extremely new to JavaScript/web programming. My upload form is mostly used for CSV files. I am already using pako to gzip my JSON (in the request URL).
How can I gzip the files before they are sent to the server?
This is roughly how I construct the FormData:
$.each($("input[type=file]"), function(i, obj) {
$.each(obj.files, function(j, file) {
formData.append(obj.name, file); // we need to gzip the data
})
});
Edit 1: I've managed to (I think) gzip the files using pako, but there's one issue: async problems. This is my new code:
$.each($("input[type=file]"), function(i, obj) {
$.each(obj.files, function(j, file) {
formData.append(obj.name, file); // we need to gzip the data
var r = new FileReader();
r.onload = function(){
var zippedResult = pako.gzip(r.result);
var oMyBlob = new Blob(zippedResult, {type : file.type}); // the blob
formData.append(obj.name, oMyBlob); // we need to gzip the data
};
r.readAsArrayBuffer(file);
})
});
// Time to send the formData!
$.ajax({......
As you can see, the issue is that the onload callback only runs after the AJAX call has already executed, so the formData is empty when it is sent.
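A common way around this (not part of the original post) is to wrap each FileReader in a Promise and only fire the AJAX call once every file has been read and compressed. A minimal sketch, assuming pako is loaded and that formData and the $.ajax options already exist:
// Sketch only: wait for all files to be read and gzipped before sending.
function gzipFile(file, fieldName, formData) {
    return new Promise(function (resolve, reject) {
        var r = new FileReader();
        r.onload = function () {
            var zipped = pako.gzip(new Uint8Array(r.result)); // Uint8Array in, Uint8Array out
            formData.append(fieldName, new Blob([zipped], {type: file.type}), file.name);
            resolve();
        };
        r.onerror = reject;
        r.readAsArrayBuffer(file);
    });
}
var pending = [];
$.each($("input[type=file]"), function (i, obj) {
    $.each(obj.files, function (j, file) {
        pending.push(gzipFile(file, obj.name, formData));
    });
});
Promise.all(pending).then(function () {
    // Only now is formData fully populated.
    // $.ajax({ ..., data: formData, processData: false, contentType: false });
});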
Edit 2: I'm attempting to attach a change event handler to the file inputs; this is what I have come up with so far. There is a problem though: it doesn't seem to be zipping correctly. Data type issues?
$("input[type=file]").change(function (event){
var fileList = this.files;
$.each(fileList,function(i,file){
var r = new FileReader();
r.onload = function(){
var zippedResult = pako.gzip(r.result);
var oMyBlob = new Blob(zippedResult, {type : file.type});
app.formData.append(event.target.name, oMyBlob, file.name);
};
r.readAsArrayBuffer(file);
});
});

This is what I've done. Note that formData is a global variable. Remember to clear formData after you've submitted, otherwise it will just keep growing. Also, if you re-select a file it will be appended to the form again (which might not be what you want); I have not yet found a way around that, though the sketch after the code below shows one possible workaround.
$("input[type=file]").change(function (event){
var fileList = this.files;
$.each(fileList,function(i,file){
var r = new FileReader();
r.onload = function(){
var convertedData = new Uint8Array(r.result);
// Zipping Uint8Array to Uint8Array
var zippedResult = pako.gzip(convertedData, {to : "Uint8Array"});
// Need to convert back Uint8Array to ArrayBuffer for blob
var convertedZipped = zippedResult.buffer;
var arrayBlob = new Array(1);
arrayBlob[0] = convertedZipped;
// Creating a blob file with array of ArrayBuffer
var oMyBlob = new Blob(arrayBlob , {type : file.type} ); // the blob (we need to set file.type if not it defaults to application/octet-stream since it's a gzip, up to you)
app.formData.append(event.target.name, oMyBlob, file.name); // we need to gzip the data
};
r.readAsArrayBuffer(file);
});
});
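One possible way around the re-select problem (my own suggestion, not part of the original answer) is to keep the gzipped blobs in a plain object keyed by input name and file name, and only build a fresh FormData at submit time:
// Sketch only: re-selecting a file overwrites its old entry instead of appending a duplicate.
var gzippedFiles = {}; // { fieldName: { fileName: Blob } }
$("input[type=file]").change(function (event) {
    var fieldName = event.target.name;
    gzippedFiles[fieldName] = {}; // drop anything from a previous selection
    $.each(this.files, function (i, file) {
        var r = new FileReader();
        r.onload = function () {
            var zipped = pako.gzip(new Uint8Array(r.result));
            gzippedFiles[fieldName][file.name] = new Blob([zipped], {type: file.type});
        };
        r.readAsArrayBuffer(file);
    });
});
function buildFormData() {
    var fd = new FormData();
    $.each(gzippedFiles, function (fieldName, files) {
        $.each(files, function (fileName, blob) {
            fd.append(fieldName, blob, fileName);
        });
    });
    return fd; // pass this to $.ajax at submit time instead of a global FormData
}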

I have added gzip support to JSZip.
I needed both JSZip and gzip for my web page, and JSZip has all the ingredients, but hides them in ways I couldn't get at.
JSZip is much better designed, with the ability to process big files in chunked/streaming mode; I don't think pako alone does that. I am using both ZIP files and gzip for my project, so I figured basing them on the same package would be useful.


Split an uploaded file into multiple chunks using javascript

I'm looking for a way to split up any text/data file on the front end, in the browser, before it is uploaded as multiple files. My limit is 40KB per upload. So if a user uploads a 400KB file, it would be split into 10 separate chunks or 10 separate files on the front end before being uploaded to the server.
Currently, I'm doing it by converting the file into a base64-formatted string, then splitting this string into 40KB pieces, which comes out to 10 separate chunks. From there I upload each chunk with a filename of chunk-1-of-10, chunk-2-of-10...
When pulling these files back down, I just concatenate all the chunks and decode the result from base64 back into its file format.
Is there a better way of doing this? Is there a library that handles all of this instead of writing it from scratch? I'm not sure if the base64 route is the best way to do this.
There is no need to read the content into RAM with a FileReader.
Using base64 will only increase the size of what you need to upload; base64 takes up ~33% more space.
Use Blob.slice to get the chunks.
Blob slices (chunks) will not increase memory use; a slice just creates a new reference to the old blob with a changed offset and size that tells it where to start reading from.
When fetch sends the data it is piped directly from the disk to the network without even touching the main thread.
// simulate a file from an input
const file = new File(['a'.repeat(1000000)], 'test.txt')
const chunkSize = 40000
const url = 'https://httpbin.org/post'

// run this loop inside an async function (or a module with top-level await)
for (let start = 0; start < file.size; start += chunkSize) {
  const chunk = file.slice(start, start + chunkSize)
  const fd = new FormData()
  fd.set('data', chunk)
  await fetch(url, { method: 'post', body: fd }).then(res => res.text())
}
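For the download side mentioned in the question, the chunks can then be glued back together on the client without any base64 step. A rough sketch, assuming hypothetical URLs like /chunks/1, /chunks/2, ... that return the raw chunk bytes in order:
// Sketch only: fetch the chunks in order and concatenate them into one Blob.
async function reassemble(chunkCount, mimeType) {
  const parts = []
  for (let i = 1; i <= chunkCount; i++) {
    const res = await fetch(`/chunks/${i}`)
    parts.push(await res.blob())
  }
  return new Blob(parts, { type: mimeType }) // the original file, byte for byte
}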
You could avoid having to base64 encode by using a FileReader and then sending as binary:
const url = 'http://www.example.com/upload';
document.getElementById('file-uploader').addEventListener('change', function(e) {
  const size = 40000;
  var reader = new FileReader();
  var buf;
  var file = document.getElementById('file-uploader').files[0];
  reader.onload = function(e) {
    buf = new Uint8Array(e.target.result);
    var totalChunks = Math.ceil(buf.length / size);
    for (var i = 0; i < buf.length; i += size) {
      var fd = new FormData();
      // e.g. "myfile.csv-1-of-10"
      fd.append('fname', [file.name, Math.floor(i / size) + 1, 'of', totalChunks].join('-'));
      fd.append('data', new Blob([buf.subarray(i, i + size)]));
      var oReq = new XMLHttpRequest();
      oReq.open("POST", url, true);
      oReq.onload = function (oEvent) {
        // Uploaded.
      };
      oReq.send(fd);
    }
  };
  reader.readAsArrayBuffer(file);
});
<input type="file" id="file-uploader"/>

Javascript formdata: encrypt files before appending

I need to modify existing frontend (angular) code that involves uploading files to a server. Now the files need to be encrypted before being uploaded.
The current approach uses FormData to append a number of files and send them in a single request as shown below:
function uploadFiles(wrappers){
    var data = new FormData();
    // Add each file
    for(var i = 0; i < wrappers.length; i++){
        var wrapper = wrappers[i];
        var file = wrapper.file;
        data.append('file_' + i, file);
    }
    $http.post(uri, data, requestCfg).then(
        /*...*
I have been using Forge in other projects, but never in this sort of context and don't really see how to encrypt files on the fly and still append them as FormData contents.
Forge provides an easy API:
var key = forge.random.getBytesSync(16);
var iv = forge.random.getBytesSync(8);
// encrypt some bytes
var cipher = forge.rc2.createEncryptionCipher(key);
cipher.start(iv);
cipher.update(forge.util.createBuffer(someBytes));
cipher.finish();
var encrypted = cipher.output;
The backend receives files using Formidable, and all the file handling is already wired up. I would thus like to stick to the existing front-end logic but simply insert the encryption step; it's not the entire FormData that must be encrypted, only the files. I haven't found a good lead yet on how to approach this.
Suggestions are very welcome!
OK, I found a solution and added the decrypt code as well. This adds a layer of async code.
function appendFile(aFile, idx){
    // Encrypt if a key was provided for this protocol test
    if(!key){
        data.append('dicomfile_' + idx, aFile);
        appendedCount++;
        onFileAppended();
    }
    else{
        var reader = new FileReader();
        reader.onload = function(){
            // 1. Read bytes
            var arrayBuffer = reader.result;
            var bytes = new Uint8Array(arrayBuffer); // byte array aka uint8
            // 2. Encrypt
            var cipher = forge.cipher.createCipher('AES-CBC', key);
            cipher.start({iv: iv});
            cipher.update(forge.util.createBuffer(bytes));
            cipher.finish();
            // 3. To blob (File extends Blob)
            var encryptedByteCharacters = cipher.output.getBytes(); // similar to an atob(b64) output
            // var asB64 = forge.util.encode64(encryptedBytes);
            // var encryptedByteCharacters = atob(asB64);
            // Convert to Blob object
            var blob = byteCharsToBlob(encryptedByteCharacters, "application/octet-stream", 512);
            // 4. Append blob
            data.append('dicomfile_' + idx, blob, aFile.name);
            // Decrypt for the sake of testing
            if(true){
                var fileReader = new FileReader();
                fileReader.onload = function() {
                    arrayBuffer = this.result;
                    var bytez = new Uint8Array(arrayBuffer);
                    var decipher = forge.cipher.createDecipher('AES-CBC', key);
                    decipher.start({iv: iv});
                    decipher.update(forge.util.createBuffer(bytez));
                    decipher.finish();
                    var decryptedByteCharacters = decipher.output.getBytes();
                    var truz = bytes === decryptedByteCharacters;
                    var blob = byteCharsToBlob(decryptedByteCharacters, "application/octet-stream", 512);
                    data.append('decrypted_' + idx, blob, aFile.name + '.decrypted');
                    appendedCount++;
                    onFileAppended();
                };
                fileReader.readAsArrayBuffer(blob);
            }
            else{
                // z. Resume processing
                appendedCount++;
                onFileAppended();
            }
        };
        // Read file
        reader.readAsArrayBuffer(aFile);
    }
}
function onFileAppended(){
    // Only proceed when all files were appended and optionally encrypted (async)
    if(appendedCount !== wrappers.length) return;
    /* resume processing, upload or do whatever */
}
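The answer calls a byteCharsToBlob helper that isn't shown. A minimal sketch of what such a helper might look like, assuming it turns a binary string (one character per byte, as returned by Forge's getBytes()) into a Blob in fixed-size slices:
// Sketch only: convert a binary string to a Blob without building one huge array.
function byteCharsToBlob(byteCharacters, contentType, sliceSize) {
    var byteArrays = [];
    for (var offset = 0; offset < byteCharacters.length; offset += sliceSize) {
        var slice = byteCharacters.slice(offset, offset + sliceSize);
        var byteNumbers = new Uint8Array(slice.length);
        for (var i = 0; i < slice.length; i++) {
            byteNumbers[i] = slice.charCodeAt(i); // each char code is one byte value
        }
        byteArrays.push(byteNumbers);
    }
    return new Blob(byteArrays, {type: contentType});
}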

JS - How to compute MD5 on binary data

EDIT: changed title from "JS File API - write and read UTF-8 data is inconsistent" to reflect the actual question.
I have some binary content I need to calculate the MD5 of. The content is a WARC file, which means it holds text as well as encoded images. To avoid errors in the file saving, I convert and store all the data in ArrayBuffers. All the data is put into Uint8Arrays to convert it to UTF-8.
My first attempt, for testing, was to use the saveAs library to save files from a Chrome extension. That means I was passing a Blob object to the method to create the file.
var b = new Blob(arrayBuffers, {type: "text/plain;charset=utf-8"});
saveAs(b,'name.warc');
I haven't found a tool to compute the MD5 of a Blob object directly, so what I was doing was using a FileReader to read the blob as binary data and then an MD5 tool (I tried CryptoJS as well as a tool from faultylabs) to compute the result.
f = new FileReader();
f.readAsBinaryString(b);
f.onloadend = function(a){
    console.log('Original file checksum: ', faultylabs.MD5(this.result));
}
The resources (images) are downloaded directly in arraybuffer format so I have no need to convert them.
The result was wrong, meaning that the MD5 computed in code and the MD5 of the file I saved on my local machine were two different values. Reading the blob as text, obviously, throws an error.
The workaround I found consists of writing the blob to disk using the FileSystem API, reading it back as binary data, computing the MD5, and then saving that retrieved file as the WARC file (not the blob object directly, but this "refreshed" version of the file).
In this case the computed MD5 is fine (I calculate it on the "refreshed" version of the WARC file), but when I launch the WARC replay instance with the "refreshed" archive it throws errors, while with the original file I don't have any problem (but then the MD5 is not correct).
var fd = new FormData();
// To compute the md5 hash and have it correct on the server side, we need to write the file
// to the filesystem, read it back and then calculate the md5 value.
// We need to send this version of the warc file to the server as well.
window.requestFileSystem = window.requestFileSystem || window.webkitRequestFileSystem;

function computeWARC_MD5(callback, formData) {
    window.requestFileSystem(window.TEMPORARY, b.size, onInitFs);
    function onInitFs(fs) {
        fs.root.getFile('warc.warc', {create: true}, function(fileEntry) {
            fileEntry.createWriter(function(fileWriter) {
                fileWriter.onwriteend = function(e) {
                    readAndMD5();
                };
                fileWriter.onerror = function(e) {
                    console.error('Write failed: ' + e.toString());
                };
                fileWriter.write(b);
            });
        });
        function readAndMD5() {
            fs.root.getFile('warc.warc', {}, function(fileEntry) {
                fileEntry.file(function(file) {
                    var reader = new FileReader();
                    reader.onloadend = function(e) {
                        var warcMD5 = faultylabs.MD5(this.result);
                        console.log(warcMD5);
                        var g = new Blob([this.result], {type: "text/plain;charset=utf-8"});
                        saveAs(g, o_request.file);
                        formData.append('warc_file', g);
                        formData.append('warc_checksum_md5', warcMD5.toLowerCase());
                        callback(formData);
                    };
                    reader.readAsBinaryString(file);
                });
            });
        }
    }
}

function uploadData(formData) {
    // upload
    $.ajax({
        type: 'POST',
        url: server_URL_upload,
        data: fd,
        processData: false,
        contentType: false,
        // [SPECS] fire a progress event named progress at the XMLHttpRequestUpload object about every 50ms or for every byte transmitted, whichever is least frequent
        xhrFields: {
            onprogress: function (e) {
                if (e.lengthComputable) {
                    console.log(e.loaded / e.total * 100 + '%');
                }
            }
        }
    }).done(function(data) {
        console.log('done uploading!');
        //displayMessage(port_to_page, 'Upload finished!', 'normal')
        //port_to_page.postMessage( { method:"doneUpload" } );
    });
}

computeWARC_MD5(uploadData, fd);
saveAs(b, 'warc.warc');
Could anybody explain to me why there is this discrepancy? What am I missing in treating all the objects I am dealing with as binary data (store, read)?
Basically, I tried another route: convert the blob back to an ArrayBuffer and compute the MD5 on that. At that point the file's MD5 and the ArrayBuffer's are the same.
var b = new Blob(arrayBuffers, {type: "text/plain;charset=utf-8"});
var blobHtml = new Blob([str2ab(o_request.main_page_html)], {type: "text/plain;charset=utf-8"});

f = new FileReader();
f.readAsArrayBuffer(b);
f.onloadend = function(a){
    var warcMD5 = faultylabs.MD5(this.result);
    var fd = new FormData();
    fd.append('warc_file', b);
    fd.append('warc_checksum_md5', warcMD5.toLowerCase());
    uploadData(fd);
}
I guess the result from a binary string and from an ArrayBuffer is different, which is why the MD5 is inconsistent as well.
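One plausible explanation (my own note, not part of the original answer): an MD5 routine given a JavaScript string may UTF-8 encode it first, so any "byte" with a value of 0x80 or above expands into two bytes and the digest no longer matches the file, whereas hashing the raw bytes of an ArrayBuffer avoids that. A tiny illustration:
// Illustration only: one byte read via readAsBinaryString becomes a one-character
// string, but UTF-8 encoding that string produces two bytes.
var binaryChar = String.fromCharCode(0xE9);           // char code 233, i.e. one byte of the file
var utf8Bytes = new TextEncoder().encode(binaryChar); // Uint8Array [0xC3, 0xA9]
console.log(utf8Bytes.length);                        // 2 -> different input, different MD5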

How to do file upload in e2e AngularJS tests?

In one of my views, I have a file upload control. It supports file uploading either via drag and drop, or via the standard file dialog opened after a button click.
How can I do this in my e2e tests?*
* Just one of the two options will be enough.
You can upload files using JavaScript blobs. This requires the File API, which isn't supported by older browsers (http://caniuse.com/fileapi). But since you mentioned drag-and-drop uploads, which also use the File API, it shouldn't matter too much.
There are two ways you can upload files using the blob API. One is very easy and the other is simply a continuation of the first.
Using JavaScript, you can create a new blob with:
var blob = new Blob(["content"], {type: contentType});
For example, this will create a blob object that contains the text "Hello World!":
var foo = new Blob(["Hello World!"], {type: "text/plain"});
The following method is better for non-plaintext files, such as PDFs. You have to convert the file to Base64 (an online converter works for this) and create the blob from the Base64 data.
Use this function (a slightly modified version of an existing snippet) to create the blob.
function b64toBlob(b64Data, contentType, sliceSize) {
    b64Data = b64Data.replace(/\s/g, '');
    contentType = contentType || '';
    sliceSize = sliceSize || 1024;
    function charCodeFromCharacter(c) {
        return c.charCodeAt(0);
    }
    var byteCharacters = atob(b64Data);
    var byteArrays = [];
    for (var offset = 0; offset < byteCharacters.length; offset += sliceSize) {
        var slice = byteCharacters.slice(offset, offset + sliceSize);
        var byteNumbers = Array.prototype.map.call(slice, charCodeFromCharacter);
        var byteArray = new Uint8Array(byteNumbers);
        byteArrays.push(byteArray);
    }
    var blob = new Blob(byteArrays, {type: contentType});
    return blob;
}
For example, this will create a PDF blob object.
var pdf = "JVBERi0xLjQKJcfsj6IKNSAwIG9...=="; //base64 encoded file as a String
var pdfBlob = b64toBlob(pdf, "application/pdf", 1024);
After you create the blob with one of the methods above, it can be treated as a file. For example, you could put the file into a FormData object (if you're doing uploads like this):
var fd = new FormData();
fd.append("uploadedFile", pdfBlob, "My PDF.pdf");
* The filename parameter only seems to work on Chrome as of now.
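To complete the picture, a minimal sketch of actually posting that FormData from the test, assuming a hypothetical /upload endpoint on the application under test:
// Sketch only: send the FormData built above to a hypothetical /upload endpoint.
var xhr = new XMLHttpRequest();
xhr.open("POST", "/upload", true);
xhr.onload = function () {
    console.log("upload finished with status " + xhr.status);
};
xhr.send(fd); // the browser sets the multipart/form-data boundary automatically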
