File Upload & Google Protobuf - javascript

I'm having a hard time figuring out how to send a file over a WebSocket using Google Protocol Buffers.
My message structure for the buffer is:
message FileData_m {
    required uint32 block = 1; // File starting offset
    required bytes data = 2;   // Size of 65536 for blocks
}
The idea is to break the file up into blocks and send it over a WebSocket. Currently I'm running a Node.js server that handles interactions between host and server; I just don't know how to properly send the file in binary form.
Any help and/or pointing me in the right direction would be very helpful!

The solution was to make sure the data was sent as an ArrayBuffer. The uploaded file has to be read into one first; assigning the File object itself does not give you binary data:
var dataToSend; // the file's bytes as an ArrayBuffer
var reader = new FileReader();
reader.onload = function () { dataToSend = reader.result; };
reader.readAsArrayBuffer(file); // file that was uploaded from the file chooser
Then, when creating your protobuf message, use that ArrayBuffer as the source of the data:
var fileData = new FileData({
    "block": 0,
    "data": dataToSend
});
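For the block-by-block transfer itself, here is a minimal sketch of one way to wire it up, assuming a protobuf.js-style FileData type whose encode(...).finish() returns a Uint8Array ready to send; the function and variable names are illustrative:
function sendFile(ws, file) {
    var BLOCK_SIZE = 65536;
    var offset = 0;

    function sendNextBlock() {
        var reader = new FileReader();
        reader.onload = function () {
            var message = FileData.create({
                block: offset,                      // file starting offset, per FileData_m
                data: new Uint8Array(reader.result) // raw bytes of this block
            });
            ws.send(FileData.encode(message).finish()); // one binary frame per block
            offset += BLOCK_SIZE;
            if (offset < file.size) sendNextBlock();
        };
        // Read only this 64 KiB slice, and as binary rather than a data URL.
        reader.readAsArrayBuffer(file.slice(offset, offset + BLOCK_SIZE));
    }

    sendNextBlock();
}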

Related

Merging uploaded file chunks in PHP results in corrupt files

I am developing a file-uploader section in my app. The client side is Vue.js and the backend is PHP, using Laravel as my framework.
I am slicing up the selected files on the client side using Blob.slice() (I have also tried the FileReader API and Resumable.js, and I'm now working on my own implementation). Data is sent using XHR (I tried Axios and XMLHttpRequest), one request per "slice" or "chunk". I fetch the data at the backend and save the incoming file as "chunk1", "chunk2" ... and so on. Upon receiving the last chunk, I merge the chunks using PHP.
My problem is that the merged file somehow ends up corrupt: MP4s are not playable or not seekable, EXEs are corrupt (some EXEs come out fine, but not all; it's unpredictable), and some small PDFs survive.
Failed Attempts
Send sliced data with multipart/form-data:
--- save each chunk with Storage::put() or Storage::putFileAs()
--- save each chunk with fopen(file, 'wb' or 'ab'), fwrite(), fclose()
--- save each chunk with file_put_contents()
Send sliced data with base64 encoding:
--- save each chunk as received (base64 encoded), then read each chunk with base64_decode() while writing the data to a new file
--- append all chunks as received (base64 encoded) to one file, then create a new file by decoding this appended file (this attempt was by far the most successful one, but some files still corrupted, especially EXEs)
Client side code ...
upload(file, start = 0, request = 0) {
    let chunkSize = 1024 * 1024 * 3;
    let end = (start + chunkSize) > file.fileObject.size ? file.fileObject.size : (start + chunkSize);
    let reader = new FileReader();
    let slice = file.fileObject.slice(start, end);
    reader.onload = (e) => {
        let data = {
            fileName: file.fileObject.name,
            chunkNumber: request + 1,
            totalChunks: Math.ceil(file.fileObject.size / chunkSize),
            chunk: reader.result.split(',')[1]
        }
        axios({
            url: '/api/admin/batch-sessions/' + this.batchSessionId + '/files',
            method: 'POST',
            data: data,
            headers: {'Content-Type': 'application/json'}
        })
        .then(res => {
            start += chunkSize;
            request++;
            if (start <= file.fileObject.size) {
                this.upload(file, start, request);
            }
        })
        .catch(err => {
            console.log(err.message);
        });
    }
    reader.readAsDataURL(slice);
}
Server side code ...
public function handle()
{
    $chunks = Storage::disk('s3-upload-queue')
        ->files($this->directory);
    $mergedFile = Storage::disk('s3-upload-queue')->path($this->directory.'/'.basename($this->directory));
    $base64File = Storage::disk('s3-upload-queue')->path($this->directory.'/'.basename($this->directory).'.b64');
    $mfs = fopen($mergedFile, 'wb');
    $b64fs = fopen($base64File, 'r');
    fwrite($mfs, base64_decode(fread($b64fs, filesize($base64File))));
    fclose($mfs);
    fclose($b64fs);
}
Actually I do not have in-depth knowledge about different encodings. I was reading about base64 chunking here on Stack Overflow and tried to create slices of size 1024 * 1024 * 3; that is when most files were merged successfully using base64-encoded transfer, but that too was unpredictable: some files still corrupted. I am trying to understand this properly.
Please let me know if more info is needed. Thanks.
chunk: reader.result.split(',')[1]
What is this? Are you removing the base64 prefix? It does not exist in the chunks, so your chunk is always empty: there is nothing at index 1, because the string is never split.
I would also use code like this to get the blob:
let blob = file.slice(offset, length + offset);
file.fileReader.onload = async function () {
    let base64 = btoa(this.result); // binary string -> base64
    // Do whatever you want
};
file.fileReader.readAsBinaryString(blob);
About the corruption: I also wonder how you know whether all chunks were sent successfully. Do you keep a log of chunks on the server side, not only on the client side?
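For comparison, here is a rough sketch of the same upload sending the raw Blob slice via FormData, so base64 never enters the picture; it reuses the question's endpoint, but the server would then need to read each chunk as a multipart file rather than a JSON field:
function uploadChunk(fileObject, start, chunkNumber, totalChunks, batchSessionId) {
    let chunkSize = 1024 * 1024 * 3;
    let end = Math.min(start + chunkSize, fileObject.size);
    let form = new FormData();
    form.append('fileName', fileObject.name);
    form.append('chunkNumber', chunkNumber);
    form.append('totalChunks', totalChunks);
    form.append('chunk', fileObject.slice(start, end)); // the Blob itself, untouched
    // Let the browser/axios set the multipart boundary; don't force a Content-Type.
    return axios.post('/api/admin/batch-sessions/' + batchSessionId + '/files', form);
}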

File Uploading ReadAsDataUrl

I have a question about the File API and uploading files in JavaScript and how I should do this.
I have already built a file uploader that was quite simple: it took the files from an input and made a request to the server, and the server then handled the files and saved a copy in an uploads directory.
However, I am trying to give people the option to preview a file before uploading it. So I took advantage of the File API, specifically new FileReader() and readAsDataURL().
The file object has a list of properties such as .size and .lastModifiedDate, and I added the readAsDataURL() output to my file object as a property for easy access in my Angular ng-repeat().
My question is this: it occurred to me as I was doing this that I could store the data URL in a database rather than upload the actual file. I was unsure whether modifying the File object directly by attaching its data URL as a property would affect its transfer.
What is the best practice? Is it better to upload the file, or can you just store the data URL and then output that, since that is essentially the file itself? Should I not modify the File object directly?
Thank you.
Edit: I should also note that this is a project for a customer who wants it to be hard for users to simply take uploaded content from the application, save it, and redistribute it. Would saving the files as URLs in a database mitigate right-click-save-as behavior, or not really?
There is more than one way to preview a file. The first is a data URL via FileReader, as you mention, but there is also URL.createObjectURL(), which is faster.
Decoding and encoding to and from base64 takes longer and needs more computation and more CPU/memory than leaving the data in binary format, which I can demonstrate below:
var url = 'https://upload.wikimedia.org/wikipedia/commons/c/cc/ESC_large_ISS022_ISS022-E-11387-edit_01.JPG'
fetch(url).then(res => res.blob()).then(blob => {
    // Simulates a file as if you were to upload it through a file input and listen for onchange
    var files = [blob]
    var img = new Image
    var t = performance.now()
    var fr = new FileReader
    img.onload = () => {
        // show it...
        // $('body').append(img)
        var ms = performance.now() - t
        document.body.innerHTML = `it took ${ms.toFixed(0)}ms to load the image with FileReader<br>`
        // Now create an object URL instead of using base64, which takes time to
        // 1) encode the blob to base64
        // 2) decode it back again from base64 to binary
        var t2 = performance.now()
        var img2 = new Image
        img2.onload = () => {
            // show it...
            // $('body').append(img)
            var ms2 = performance.now() - t2
            document.body.innerHTML += `it took ${ms2.toFixed(0)}ms to load the image with URL.createObjectURL<br><br>`
            document.body.innerHTML += `URL.createObjectURL was ${(ms - ms2).toFixed(0)}ms faster`
        }
        img2.src = URL.createObjectURL(files[0])
    }
    fr.onload = () => (img.src = fr.result)
    fr.readAsDataURL(files[0])
})
The base64 version will be ~33% larger. For mobile devices, I think you would want to save bandwidth and battery.
There is also the latency of doing an extra request, but that's where HTTP/2 comes to the rescue.
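For the preview itself, a minimal sketch using URL.createObjectURL (the input selector is illustrative):
var input = document.querySelector('input[type=file]');
input.addEventListener('change', function () {
    var img = new Image();
    var url = URL.createObjectURL(this.files[0]);
    img.onload = function () {
        URL.revokeObjectURL(url); // release the blob reference once loaded
    };
    img.src = url;
    document.body.appendChild(img);
});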

JavaScript - load PDF file from URL (cross domain) into variable

I would like to load a PDF file from a URL into a JavaScript variable (the file is on another domain) and then print the base64-encoded string of that file.
This script lets me browse for a file on my computer and then prints the base64 string to the browser console:
<input id="inputFile" type="file" onchange="convertToBase64();" />
<script type="text/javascript">
    function convertToBase64() {
        // Read the file list
        var selectedFile = document.getElementById("inputFile").files;
        // Check that the list is not empty
        if (selectedFile.length > 0) {
            // Select the very first file from the list
            var fileToLoad = selectedFile[0];
            // FileReader for reading the file's contents
            var fileReader = new FileReader();
            var base64;
            // On load, grab the file content
            fileReader.onload = function (fileLoadedEvent) {
                base64 = fileLoadedEvent.target.result;
                // Print data to the console
                console.log(base64);
            };
            // Convert data to base64
            fileReader.readAsDataURL(fileToLoad);
        }
    }
</script>
I would like to completely remove the input button from this script and instead load the file into the variable selectedFile from a URL (for example: http://www.example.com/docs/document.pdf).
I'd need help realizing this, because I am not sure whether XMLHttpRequest() works cross-domain, and the Ajax/jQuery scripts I've found operate mainly on JSON files, which is different from what I need.
Thank you very much for the help.
You cannot do this in normal browser-based JavaScript* if the other side (http://www.example.com in your case) doesn't allow cross-origin requests from your origin.
If the other side does let you do this, then yes, you'd use XMLHttpRequest (or jQuery's wrappers for it, such as ajax or get) to request the data and transform/display it as you see fit.
A fairly typical way to work around that if the other side doesn't is to use your own server in-between: Make the request to your server, have it make the request to the other side (server-side code doesn't have the Same Origin Policy blocks that browsers impose), and then have your server respond to your request with the data from the other server.
* "normal browser-based JavaScript" - e.g., without starting the browser with special flags that disable security, or getting people to install an extension, etc.

nodejs binary websocket mimetype handling

I'm not 100% sure, but from what I've read, when I send a blob (binary data) over a WebSocket, the blob does not carry any file information. (The official specification also states that WebSockets only send the raw binary.) That means the following are lost:
the file size
the MIME type
user info (explained later)
I'm using https://github.com/websockets/ws
Testing:
Sending the blob from a file input directly:
ws.send(this.files[0]) // this should already contain the info
Creating a new blob from the file with the native JavaScript API, setting the proper MIME type:
ws.send(new Blob([this.files[0]], {type: this.files[0].type})); // also this
On both sides you get only the blob itself, without any other information.
Is it possible to prepend, say, 4 KB of predefined JSON data, also converted to binary, containing important information like the MIME type and the file size, and then just split off the 4 KB when needed?
{"mime":"txt/plain","size":345}____________4KB_REST_OF_THE_BINARY
OR
ws.send({"mime":"txt\/plain","size":345})
ws.send(this.files[0])
Even if the first one is the worst solution ever, it would allow me to send everything in one go.
The second one has a big problem:
this is a chat that also allows sending files like documents, images, music, videos.
I could write some sort of handshaking system that sends the file/user info before I send the binary data.
BUT
if another person also sends a file, since everything is async, the handshaking system has no way to determine which file belongs to which user and MIME type.
So how do you properly send a binary file in a multiuser async environment?
I know I can convert to base64, but that's 30% bigger.
Btw, I'm totally disappointed with Apple... while Chrome shows every kind of binary data properly, my iOS devices are not able to handle blobs; only images will show in blob or base64 format, not even a simple txt file. Basically only an <img> tag can read dynamic files.
How everything works (now):
a user sends a file
Node.js gets the binary data, plus the user info... but not the MIME type, filename or size.
Node.js broadcasts the raw binary file to all the users (it can't specify user & file info).
the clients create a blob URL (who sent that? XD).
EDIT
What I have now:
Client 1 (sends a file), CHROME
fileInput.addEventListener('change', function (e) {
    var file = this.files[0];
    ws.send(new Blob([file], {
        type: file.type // <- SET MIMETYPE
    }));
    // file.size
}, false);
Note: file is already a blob... but this is how you would normally create a new blob while specifying the MIME type.
Server (broadcasts the binary data to the other clients), NODEJS
...aaaaaand the MIME type is gone:
ws.addListener('message', function (binary) {
    var b = 0, c = wss.clients.length;
    while (b < c) {
        wss.clients[b++].send(binary);
    }
});
Client 2 (receives the binary), CHROME
ws.addEventListener('message', function (msg) {
    var blob = new Blob([msg.data], {
        type: 'application/octet-stream' // <- LOST
    });
    var file = window.URL.createObjectURL(blob);
}, false);
Note: msg.data is already a blob... but this is how you would normally create a new blob while specifying the MIME type, which is lost.
In client 2 I need the MIME type, and naturally I also need the info about the user, which can be retrieved from client 1 or from the server (not a good choice)...
You're a bit out of luck with this, because Node doesn't support the Blob interface: any data you send or receive in binary with Node is just binary, and you would need something that knows how to interpret a Blob object.
Here's an idea; let me know if this works. Reading through the documentation for websockets/ws, it says it supports sending and receiving ArrayBuffers, which means you can use typed arrays.
Here's where it gets nasty. You set a certain fixed number n of bytes at the beginning of every typed array to signal the MIME type, encoded in UTF-8 or what have you, and the rest of your typed array contains your file's bytes.
I would recommend using Uint8Array, because UTF-8 code units are 8 bits long and your text will probably be readable when encoded that way. As for the file bits, you'll probably just end up writing those down somewhere and appending an ending to them.
Also note that this method of interpretation works both ways, whether from Node or in the browser.
This solution is really just a form of type casting, and you might get some unexpected results. The fixed length of your MIME type field is crucial.
Here it is illustrated. Copy, paste, set the image file to whatever you want, and run it. You'll see the MIME type I set pop out.
var fs = require('fs');

// https://stackoverflow.com/questions/8609289/convert-a-binary-nodejs-buffer-to-javascript-arraybuffer
function toUint8Array(buffer) {
    var ab = new ArrayBuffer(buffer.length);
    var array = new Uint8Array(ab);
    for (var i = 0; i < buffer.length; ++i) {
        array[i] = buffer[i];
    }
    return array;
}

// data is a raw Buffer object
fs.readFile('./ducklings.png', function (err, data) {
    var mime = Buffer.from('image/png');
    var allBuffed = Buffer.concat([mime, data]);
    var array = toUint8Array(allBuffed);
    var mimeBytes = array.subarray(0, 9); // 9 = number of characters in the mime Buffer
    console.log(String.fromCharCode.apply(null, mimeBytes));
});
Here's how you do it on the client side:
SOLUTION A: GET A PACKAGE
Get buffer, an implementation of Node's Buffer API for browsers. The approach of concatenating byte buffers will work exactly as before, and you can append fields like To: and whatnot as well. The way you format your headers to best serve your clients will be an evolving process, I'm sure.
SOLUTION B: OLD SCHOOL
STEP 1: Convert your Blob to an ArrayBuffer
Notes: How to convert a String to an ArrayBuffer
var fr = new FileReader();
fr.addEventListener('loadend', function () {
    // Asynchronous action in part 2.
    var message = concatenateBuffers(headerStringAsBuffer, fr.result);
    ws.send(message);
});
fr.readAsArrayBuffer(blob);
STEP 2: Concatenate ArrayBuffers
function concatenateBuffers(buffA, buffB) {
    var byteLength = buffA.byteLength + buffB.byteLength;
    var resultBuffer = new ArrayBuffer(byteLength);
    // wrap the ArrayBuffers in typed-array views
    var resultView = new Uint8Array(resultBuffer);
    var viewA = new Uint8Array(buffA);
    var viewB = new Uint8Array(buffB);
    // copy 8-bit integers, AKA bytes
    resultView.set(viewA);
    resultView.set(viewB, viewA.byteLength);
    return resultView.buffer;
}
STEP 3: Receive and Reblob
I'm not going to repeat how to convert the concatenated string bytes back into a string, because I've done it in the server example, but turning the file bytes into a blob of your MIME type is fairly simple:
new Blob([buffer.slice(offset, buffer.byteLength)], {type: mimetype});
This Gist by robnyman goes into further detail on how you would use an image transmitted via XHR, put it into localStorage, and use it in an image tag on your page.
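Putting step 3 together, here is a minimal sketch of the receiving end, under the assumption that the sender prepended a fixed 16-byte, space-padded MIME field (any fixed length works, as long as both sides agree on it):
var MIME_FIELD_LENGTH = 16;

ws.binaryType = 'arraybuffer'; // receive ArrayBuffers instead of Blobs
ws.addEventListener('message', function (msg) {
    var bytes = new Uint8Array(msg.data);
    var mimeBytes = bytes.subarray(0, MIME_FIELD_LENGTH);
    var mimetype = String.fromCharCode.apply(null, mimeBytes).trim();
    var blob = new Blob([bytes.subarray(MIME_FIELD_LENGTH)], { type: mimetype });
    var fileUrl = URL.createObjectURL(blob); // hand this to an <img>, <a>, etc.
});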
I liked Breedly's idea of prepending a fixed-length byte array to indicate the MIME type of the ArrayBuffer, so I created this npm package that I use when dealing with WebSockets; maybe others might find it useful.
Example usage
const {
    arrayBufferWithMime,
    arrayBufferMimeDecouple
} = require('arraybuffer-mime')

// some image array buffer
const uint8 = new Uint8Array(1)
uint8[0] = 1
const ab = uint8.buffer

const mime = 'image/png'
const abWithMime = arrayBufferWithMime(ab, mime)

const {mime: decodedMime, arrayBuffer} = arrayBufferMimeDecouple(abWithMime)
console.log(decodedMime) // "image/png"
console.log(arrayBuffer) // ArrayBuffer

Saving binary file from base64 text data by IE, ADODB.Stream to hard disk

I have a problem when trying to save a binary file on the client via IE and ADODB.Stream.
Assume the client's browser has full permission to write files to the hard disk.
Here is my client code, written in JavaScript:
var base64_encoded_string = "base64 string of a binary file";
var data = window.atob(base64_encoded_string);
var stream = new ActiveXObject("ADODB.Stream");
stream.Type = 1; // 1 = adTypeBinary
stream.Open();
stream.Write(data); // Problem is here
stream.SaveToFile("D:\\Whatever.pfx", 2);
stream.Close();
As marked, the problem comes from writing the binary data. I always get the error:
"Arguments are of the wrong type, are out of acceptable range, or are in conflict with one another"
no matter how I format the data variable or convert it to an array of bytes, a blob, ...
Please help me with how to pass data so that the binary file gets written in this situation.
You need to use WriteText
stream.WriteText(data);
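A common variant of this pattern, as a sketch (the iso-8859-1 charset trick is an assumption about what works here, not part of the original answer): write the decoded string as text using a single-byte charset, then flip the stream to binary before saving, so each character code 0-255 is stored as exactly one byte.
var stream = new ActiveXObject("ADODB.Stream");
stream.Type = 2;               // 2 = adTypeText, required for WriteText
stream.Charset = "iso-8859-1"; // single-byte charset keeps the binary intact (assumption)
stream.Open();
stream.WriteText(data);        // data from window.atob(...)
stream.Position = 0;
stream.Type = 1;               // 1 = adTypeBinary, so SaveToFile writes raw bytes
stream.SaveToFile("D:\\Whatever.pfx", 2); // 2 = adSaveCreateOverWrite
stream.Close();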
