FileReader fails reading big blobs - javascript

I've encountered a problem with FileReader and reading quite big Blobs.
const size = 50; // MB
const blob = new Blob([new ArrayBuffer(size * 1024 * 1024)], {type: 'application/octet-stream'});
console.log(blob.size);
const reader = new FileReader();
reader.onload = function(e) {
    console.log(new Uint8Array(e.target.result));
};
reader.readAsArrayBuffer(blob.slice(0, 1024));
https://jsfiddle.net/aas8gmo2/
The example above shows that the onload function is not called every time (if it is called for you, increase the size of the Blob to 100/200/300 MB). The problem is reproducible only in Chrome (tested with 53.0.2785.143).
Any hints what could be wrong?

Last time I used Chrome, there was a hard cap of around 500 MB on the size of a single blob.
Also, given these threads: https://bugs.chromium.org/p/chromium/issues/detail?id=375297 and https://github.com/streamproc/MediaStreamRecorder/issues/86
it appears that memory is not properly released when several small blobs are created, and you might need to reload the page to be able to go on.
(That would also explain why several tries might be needed on the JSFiddle.)
So for now, a disappointing answer, but it seems you'll have to find a workaround... or dive into Chrome's source code.

Related

FileReader memory leak in Chrome

I have a webpage with file upload functionality. The upload is performed in 5MB chunks. I want to calculate hash for each chunk before sending it to the server. The chunks are represented by Blob objects. In order to calculate the hash I am reading such blob into an ArrayBuffer using a native FileReader. Here is the code:
var reader = new FileReader();
var getHash = function (blob, callback) {
    reader.onloadend = function (e) {
        var hash = util.hash(e.target.result);
        callback(hash);
    };
    reader.readAsArrayBuffer(blob);
}
var processChunk = function (chunk) {
    if (chunk) {
        getHash(chunk, function (hash) {
            util.sendToServer(chunk, hash, function() {
                // this callback is called when chunk upload is finished
                processChunk(chunks.shift());
            });
        });
    }
}
var chunks = file.splitIntoChunks(); // gets an array of blobs
processChunk(chunks.shift());
The problem: using FileReader.readAsArrayBuffer seems to eat up a lot of memory which is not released. So far I have tested with a 5GB file on the following browsers:
Chrome 55.0.2883.87 m (64-bit): the memory goes up to 1-2GB quickly and oscillates around that. Sometimes it goes all the way up and browser tab crashes. It can use more memory than the size of read chunks. E.g. after reading 500MB of chunks the process already uses 700MB of memory.
Firefox 50.1.0: memory usage oscillates around 300-600MB
Code adjustments I have tried - all to no avail:
re-using the same FileReader instance for all chunks (as suggested in this question)
creating new FileReader for each chunk
adding timeout before starting new chunk
setting the FileReader and the ArrayBuffer to null after each read
The question: is there a way to fix the problem? Is this a bug in the FileReader implementations or am I doing something wrong?
EDIT: Here is a JSFiddle https://jsfiddle.net/andy250/pjt9udeu/
This is a bug in Chrome on Windows. It is reported here: https://bugs.chromium.org/p/chromium/issues/detail?id=674903
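The question does not show file.splitIntoChunks(). As a rough sketch, assuming it is a thin wrapper around Blob.slice(), a hypothetical standalone version could look like the following. The function name and the chunkSize parameter are illustrative, not taken from the original code.

```javascript
// Hypothetical stand-in for the question's file.splitIntoChunks().
// Assumes `blob` is a Blob or File and `chunkSize` is a positive byte count.
function splitIntoChunks(blob, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError('chunkSize must be a positive number');
  }
  var chunks = [];
  for (var start = 0; start < blob.size; start += chunkSize) {
    // slice() clamps the end offset to blob.size, so the last chunk may be shorter.
    chunks.push(blob.slice(start, start + chunkSize));
  }
  return chunks;
}
```

Called as splitIntoChunks(file, 5 * 1024 * 1024), this would give the 5MB chunks described in the question. Slicing only creates new Blob objects, so the memory growth reported above would come from the reads, not from the split itself.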

how to correctly convert pdf file to base64 in browser?

I have three failing versions of the following code in a chrome extension, which attempts to intercept a click to a link pointing to a pdf file, fetch that file, convert it to base64, and then log it. But I'm afraid I don't really know anything about binary formats and encodings, so I'm royally sucking this up.
var links = document.getElementsByTagName("a");
function transform(blob) {
    return btoa(String.fromCharCode.apply(null, new Uint8Array(blob)));
};
function getlink(link) {
    var x = new XMLHttpRequest();
    x.open("GET", link, true);
    x.responseType = 'blob';
    x.onload = function(e) {
        console.log("Raw response:");
        console.log(x.response);
        console.log("Direct transformation:");
        console.log(btoa(x.response));
        console.log("Mysterious thing I got from SO:");
        console.log(transform(x.response));
        window.location.href = link;
    };
    x.onerror = function (e) {
        console.error(x.statusText);
    };
    x.send(null);
};
for (i = 0, len = links.length; i < len; i++) {
    var l = links[i]
    l.addEventListener("click", function(e) {
        e.preventDefault();
        e.stopPropagation();
        e.stopImmediatePropagation();
        getlink(this.href);
    }, false);
};
Version 1 doesn't have the call to x.responseType, or the call to transform. It was my original, naive, implementation. It threw an error: "The string to be encoded contains characters outside of the Latin1 range."
After googling that error, I found this prior SO, which suggests that in parsing an image:
The response type needs to be set to blob. So this code does that.
There's some weird line, I don't know what it does at all: String.fromCharCode.apply(null, new Uint8Array(blob)).
Because I know nothing about binary formats, I guessed, probably stupidly, that making a PDF base64 would be the same as making some random image format base64. So, in fine SO tradition, I copied code that I don't really understand. In stages.
Version 2 of the code just set the response type to blob but didn't try the second transformation. And the code worked, and logged something that looked like a base64 string, but a clearly incorrect string. In its entirety, it logged:
W29iamVjdCBCbG9iXQ==
Which is just goofily wrong. It's obviously too short for a 46k pdf file, and a reference base64 encoding I created with python from the commandline was much much much longer, as one would expect.
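For what it's worth, that short string is not random. btoa() converts its argument to a string before encoding, and a Blob stringifies to "[object Blob]", so the output is the base64 form of that label instead of the file contents. A quick check, which can be pasted into the console:

```javascript
// btoa() coerces its argument to a string before encoding.
// A Blob uses the default object stringification, giving "[object Blob]".
var logged = 'W29iamVjdCBCbG9iXQ==';
var decoded = atob(logged);
console.log(decoded); // [object Blob]
console.log(btoa('[object Blob]') === logged); // true
```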
Version 3 of the code then also applies the mysterious transformation using String.fromCharCode and all the rest, which I shoved into the transform function.
However, that doesn't log anything at all---a blank line appears in the console in its appropriate place. No errors, no nonsense output, just a blank line.
I know I'm getting the correct file from prior testing. Also, the call to log the raw response object produces Blob {size: 45587, type: "application/pdf"}, which is the correct filesize for the pdf I'm experimenting with, so the blob actually contains what it should when it gets into the browser.
I'm using, and only need to support, a current version of chrome.
Can someone tell me what I'm doing wrong?
Thanks!
If you only need to support modern browsers, you should also be able to use FileReader#readAsDataURL.
That would let you do something like this:
var reader = new FileReader();
reader.addEventListener("load", function () {
    console.log(reader.result);
}, false);
// The function accepts Blobs and Files
reader.readAsDataURL(x.response);
This logs a data URI, which will contain your base64 data.
I think I've found my own solution. The response type needs to be arraybuffer not blob.
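This works because new Uint8Array(arrayBuffer) wraps the actual bytes, whereas a Blob is neither a buffer nor array-like, so new Uint8Array(blob) appears to produce an empty array, which would explain the blank line from version 3. One caveat: passing a very large array to String.fromCharCode.apply can exceed the engine's argument limit. Below is a minimal sketch that encodes in chunks; the helper name arrayBufferToBase64 and the chunk size are illustrative choices, not part of the original code.

```javascript
// Illustrative helper (not from the original post): base64-encode an ArrayBuffer.
function arrayBufferToBase64(buffer) {
  var bytes = new Uint8Array(buffer);
  var CHUNK = 0x8000; // arbitrary chunk size, kept small to stay within argument limits
  var binary = '';
  for (var i = 0; i < bytes.length; i += CHUNK) {
    // Each byte becomes one character in the Latin1 range, which btoa() accepts.
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}

// The first four bytes of a PDF file are "%PDF".
var header = new Uint8Array([0x25, 0x50, 0x44, 0x46]);
console.log(arrayBufferToBase64(header.buffer)); // JVBERg==
```

In the question's code this would be called as arrayBufferToBase64(x.response) after setting x.responseType = 'arraybuffer'.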

File Upload shows data with empty strings on Safari and Chrome iOS8.1.2

Hello, we are having issues with file upload in Chrome and Safari on iOS 8.1.2. Some photo uploads produce data with an empty string, while others display fine. Does anyone know a workaround so that uploaded photos are displayed consistently? Thanks so much.
Initially we thought that it was the known issue with file upload being broken on iOS 8 Safari: "http://blog.uploadcare.com/you-cannot-upload-files-to-a-server-using-mobile-safari/". However, that appeared to be a problem with 8.0.0 and has supposedly been fixed. Also, the problem is not limited to Safari and appears in Chrome on iOS as well.
Specifically, when the chosen photo was taken directly with the iPhone camera, the data appears to be empty (see the console log):
[Log] Object (controllers.js, line 228)
src: "data:,"
However, when the chosen photo is either a screenshot or a photo saved from an email, the image is in fact displayed and the data is sent:
[Log] Object (controllers.js, line 228)
src: "...."
Has anyone encountered a similar issue? Why does it work with some photos but not others? Does anyone know a workaround to display images that were taken with the camera itself?
Code snippets:
$scope.getAlbumPicture = function() {
    Camera.getAlbumPicture().then(function(fileURI) {
        $scope.normalisePicture(fileURI, function(dataURL) {
            Local.setTemp(dataURL);
            $state.go('tab.camera-detail');
        });
    }, function(err) {
        console.log(err);
    });
};
$scope.readImage = function(input) {
    if (input.files && input.files[0]) {
        var fileReader = new FileReader();
        fileReader.onloadend = function() {
            $scope.normalisePicture(fileReader.result, function(dataURL) {
                Local.setTemp(dataURL);
                $state.go('tab.camera-detail');
            });
        };
        fileReader.readAsDataURL(input.files[0]);
    }
};
TL;DR: I recently ran into an issue similar to the one described in the question and found this thread. It may be a low-memory issue on the iPhone, and there is not much of a workaround other than resizing large images with a canvas element. I use 1200 as the longest width/height when resizing, and everything works again.
For future reference, the specific hardware and software in my test were an iPhone 6 Plus with iOS 8.1.2, and the process I applied to the images was:
1. Use FileReader.onload = func with FileReader.readAsDataURL() to read the input[type="file"]'s file as a base64 string.
2. Assign the base64 string as the src of a new Image().
3. Inside image.onload = func, draw to a new canvas's 2d context with the same width and height as the image above.
4. Check the EXIF orientation info, then rotate the canvas.
5. Finally, do other stuff with the base64 string of the new image generated from canvas.toDataURL().
The empty data: result occurs randomly, both between step 1 and step 2 and in step 5, depending on the size of the image obtained from input[type="file"].
In my case, iOS Safari seems to be fine with drawing the image data to a canvas of the same width and height, but not with calling toDataURL on the canvas if it is very large. Safari doesn't hang, though: I can tap input[type="file"] again to pick the next file after the empty data: output from the previous one.
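The resize to a longest side of 1200 comes down to a small piece of arithmetic. Here is a minimal sketch of it; the helper name fitWithin is hypothetical and the rounding choice is an assumption, since the answer does not show its own code.

```javascript
// Hypothetical helper: scale dimensions so the longest side is at most maxSide,
// preserving the aspect ratio.
function fitWithin(width, height, maxSide) {
  var longest = Math.max(width, height);
  if (longest <= maxSide) {
    return { width: width, height: height }; // already small enough
  }
  var scale = maxSide / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale)
  };
}

var target = fitWithin(3264, 2448, 1200);
console.log(target.width, target.height); // 1200 900
```

The result would then be used to size the canvas before drawing, for example canvas.width = target.width; canvas.height = target.height; followed by context.drawImage(image, 0, 0, target.width, target.height).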

Data URI leak in Safari (was: Memory Leak with HTML5 canvas)

I have created a webpage that receives base64 encoded bitmaps over a Websocket and then draws them to a canvas. It works perfectly. Except, the browser's (whether Firefox, Chrome, or Safari) memory usage increases with each image and never goes down. So, there must be a memory leak in my code or some other bug. If I comment out the call to context.drawImage, the memory leak does not occur (but then of course the image is never drawn). Below are snippets from my webpage. Any help is appreciated. Thanks!
// global variables
var canvas;
var context;
...
ws.onmessage = function(evt)
{
    var received_msg = evt.data;
    var display_image = new Image();
    display_image.onload = function ()
    {
        context.drawImage(this, 0, 0);
    }
    display_image.src = 'data:image/bmp;base64,'+received_msg;
}
...
canvas=document.getElementById('ImageCanvas');
context=canvas.getContext('2d');
...
<canvas id="ImageCanvas" width="430" height="330"></canvas>
UPDATE 12/19/2011
I can work around this problem by dynamically creating/destroying the canvas every 100 images or so with createElement/appendChild and removeChild. After that, I have no more memory problems with Firefox and Chrome.
However, Safari still has a memory usage problem, but I think it is a different problem, unrelated to Canvas. There seems to be an issue with repeatedly changing the "src" of the image in Safari, as if it will never free this memory.
display_image.src = 'data:image/bmp;base64,'+received_msg;
This is the same problem described on the following site: http://waldheinz.de/2010/06/webkit-leaks-data-uris/
UPDATE 12/21/2011
I was hoping to get around this Safari problem by converting my received base64 string to a blob (with a "dataURItoBlob" function that I found on this site) and back to a URL with window.URL.createObjectURL, setting my image src to this URL, and then later freeing the memory by calling window.URL.revokeObjectURL. I got this all working, and Chrome and Firefox display the images correctly. Unfortunately, Safari does not appear to have support for BlobBuilder, so it is not a solution I can use. This is strange, since many places including the O'Reilly "Programming HTML5 Applications" book state that BlobBuilder is supported in Safari/WebKit Nightly Builds. I downloaded the latest Windows nightly build from http://nightly.webkit.org/ and ran WebKit.exe but BlobBuilder and WebKitBlobBuilder are still undefined.
UPDATE 01/03/2012
Ok, I finally fixed this by decoding the base64-encoded data URI string with atob() and then creating a pixel data array and writing it to the canvas with putImageData (see http://beej.us/blog/2010/02/html5s-canvas-part-ii-pixel-manipulation/). Doing it this way (as opposed to constantly modifying an image's "src" and calling drawImage in the onload function), I no longer see a memory leak in Safari or any browser.
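The first half of that approach, turning the base64 payload into raw bytes with atob(), can be sketched as follows. The helper name base64ToBytes is illustrative. Mapping those bytes onto an ImageData array for putImageData depends on the BMP header and pixel layout (row order, padding, channel order), which the post does not show, so that part is left out here.

```javascript
// Illustrative helper: decode a base64 string into raw bytes.
function base64ToBytes(base64) {
  var binary = atob(base64); // one character per decoded byte
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "Qk0=" is the base64 form of the two-byte BMP signature "BM".
var signature = base64ToBytes('Qk0=');
console.log(signature[0], signature[1]); // 66 77
```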
Without actual working code we can only speculate as to why.
If you're sending the same image over and over you're making a new image every time. This is bad. You'd want to do something like this:
var images = {}; // a map of all the images
ws.onmessage = function(evt)
{
    var received_msg = evt.data;
    var display_image;
    var src = 'data:image/bmp;base64,'+received_msg;
    // We've got two distinct scenarios here for images coming over the line:
    if (images[src] !== undefined) {
        // Image has come over before and therefore already been created,
        // so don't make a new one!
        display_image = images[src];
        display_image.onload = function () {
            context.drawImage(this, 0, 0);
        }
    } else {
        // Never before seen image, make a new Image()
        display_image = new Image();
        display_image.onload = function () {
            context.drawImage(this, 0, 0);
        }
        display_image.src = src;
        images[src] = display_image; // save it for reuse
    }
}
There are more efficient ways to write that (I'm duplicating onload code for instance, and I am not checking to see if an image is already complete). I'll leave those parts up to you though, you get the idea.
You're probably drawing the image a lot more times than you expect. Try adding a counter and outputting the number to an alert or to a div in the page to see how many times the image is being drawn.
That's very interesting. This is worth reporting as a bug to the various browser vendors (my feeling is that it shouldn't happen). You might get responses along the lines of "Don't do that, instead do such and such", but at least then you'll know the right answer and have an interesting thing to write up for a blog post (more people will definitely run into this issue).
One thing to try is unsetting the image src (and onload handler) right after the call to drawImage. It might not free up all the memory but it might get most of it back.
If that doesn't work, you could always create a pool of image objects and re-use them once they have drawn to the canvas. That's a hassle because you'll have to track the state of those objects and also set your pool to an appropriate size (or make it grow/shrink based on traffic).
Please report back your results. I'm very interested because I use a similar technique for the tightPNG encoding in noVNC (and I'm sure others will be interested too).
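The pool idea mentioned above could be sketched roughly like this. Everything here (createPool, acquire, release, available) is a hypothetical illustration, not code from noVNC or from the question; in the browser the factory would be something like function () { return new Image(); }.

```javascript
// Hypothetical fixed-size object pool; `create` is any factory function.
function createPool(create, size) {
  var free = [];
  for (var i = 0; i < size; i++) {
    free.push(create());
  }
  return {
    // Returns a free object, or null when all of them are in use.
    acquire: function () {
      return free.length > 0 ? free.pop() : null;
    },
    // Hands an object back to the pool once it has been drawn.
    release: function (item) {
      free.push(item);
    },
    // Number of objects currently free.
    available: function () {
      return free.length;
    }
  };
}
```

In the onmessage handler, one would acquire an image, set its onload to draw and then release it, and decide what to do (drop the frame or queue it) when acquire() returns null.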
I don't believe this is a bug. The problem seems to be that the images are stacked on top of each other. So to clear up the memory, you need to use clearRect() to clear your canvas before drawing the new image in it.
ctx.clearRect(0, 0, canvas.width, canvas.height);
How to clear your canvas matters

HTML5 File API crashes Chrome when using readAsDataURL to load a selected image

Here's my sample code:
var input = document.createElement('input');
input.type = 'file';
document.body.appendChild(input);
input.addEventListener('change', function(){
    var file = input.files[0];
    var reader = new FileReader();
    reader.onload = function(e){
        var image = new Image();
        image.src = e.target.result;
    };
    reader.readAsDataURL(file);
});
Load the page, select a large image (I'm using a 2.9MB 4288x3216 image). Refresh the page and select the same image. Result? The tab crashes! (Aw, Snap!)
My guess is that this is a bug with Chrome's implementation of the File API, but I'd love it if someone could confirm that and maybe even offer a workaround. I really want to be able to show a thumbnail of a photo without having to go to the server to generate one (even if it's just for Chrome and FF).
Also, with my sample code above, as soon as you select the photo, the tab starts using about 32MB more of memory. That, I guess, is expected, but what concerns me is that the memory never seems to get freed by the garbage collector. So if I keep selecting more photos, I keep consuming more memory. I don't know if this is related to the crashing issue or not, but it's definitely a concern.
Thanks for any help!
The memory issue could be a result of this bug: http://crbug.com/36142. Essentially, Chrome caches data: URLs and currently does not release the memory when the img.src is changed. The other issue is that base64 data: URLs add about 33% overhead to the data you're encoding. That means you're actually setting a resource of roughly 3.9MB on the image, not 2.9MB.
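The overhead follows from how base64 works: every 3 input bytes become 4 output characters, with the last group padded. A small sketch of the arithmetic, using a hypothetical helper name base64Length:

```javascript
// Number of base64 characters needed to encode byteLength bytes (padding included).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

console.log(base64Length(3)); // 4
console.log(base64Length(4)); // 8
```

For a file of a few megabytes the ratio is very close to 4/3, before counting the short "data:...;base64," prefix.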
Since you're not manipulating the content (the actual bytes), there's no need to read the file content. One option is to create a blob: url. There's also an explicit revoke method, so you won't run into the same memory caching issues. Something like:
input.addEventListener('change', function(e) {
    var file = input.files[0];
    window.URL = window.webkitURL || window.URL; // Vendor prefixed in Chrome.
    var img = document.createElement('img');
    img.onload = function(e) {
        window.URL.revokeObjectURL(img.src); // Clean up after yourself.
    };
    img.src = window.URL.createObjectURL(file);
    document.body.appendChild(img);
});
