Large blob file in JavaScript

I have an XHR request that downloads a 1 GB file.
function getFile(callback) {
    var xhr = new XMLHttpRequest();
    xhr.onload = function () {
        if (xhr.status == 200) {
            callback.apply(xhr);
        } else {
            console.log("Request error: " + xhr.statusText);
        }
    };
    xhr.open('GET', 'download', true);
    xhr.onprogress = updateProgress;
    xhr.responseType = "arraybuffer";
    xhr.send();
}
But the File API can't load all of that into memory, even from a worker; it throws an out-of-memory error...
btn.addEventListener('click', function() {
    getFile(function() {
        var worker = new Worker("js/saving.worker.js");
        worker.onmessage = function(e) {
            saveAs(e.data); // FileSaver.js creates a URL from the blob... but it's too large
        };
        worker.postMessage(this.response);
    });
});
Web Worker
onmessage = function (e) {
    var view = new DataView(e.data, 0);
    var file = new File([view], 'file.zip', { type: "application/zip" });
    postMessage(file);
};
I'm not trying to compress the file; it is already compressed by the server.
I thought about storing it in IndexedDB first, but I'll have to load the blob or file anyway; even if I request it by byte ranges, sooner or later I will have to build this giant blob.
I want to create a blob: URL and hand it to the user once the browser has downloaded it.
I'll use the FileSystem API for Google Chrome, but I want to make something for Firefox too. I looked into the FileHandle API, but nothing...
Do I have to build an extension for Firefox in order to do the same thing the FileSystem API does for Google Chrome?
Ubuntu 32-bit

Loading 1 GB+ with Ajax isn't convenient: you fill up memory just to be able to monitor the download progress.
Instead, I would just send the file with a Content-Disposition header so the browser saves it directly.
There are, however, ways to work around that and still monitor the progress. One option is to have a second websocket that signals how much has been downloaded while you download normally with a GET request; the other option is described further down.
I know you talked about using Blink's sandboxed filesystem in the conversation, but it has some drawbacks: it may need permission if using persistent storage; it only allows about 20% of the disk space that is left; and if Chrome needs to free some space, it will throw away other domains' temporary storage, least recently used first. Besides, it doesn't work in private mode.
Not to mention that support for it is being dropped and it may never land in other browsers, though they will most likely not remove it since many sites still depend on it.
The only way to process a file this large is with streams. That is why I created StreamSaver. It only works in Blink (Chrome and Opera) at the moment, but it will eventually be supported by other browsers, with the WHATWG Streams spec backing it as a standard.
fetch(url).then(res => {
    // One idea is to get the filename from the Content-Disposition header...
    const size = ~~res.headers.get('Content-Length')
    const fileStream = streamSaver.createWriteStream('filename.zip', size)
    const writeStream = fileStream.getWriter()
    // Later you will be able to just simply do
    // res.body.pipeTo(fileStream)
    // instead of pumping
    const reader = res.body.getReader()
    const pump = () => reader.read()
        .then(({ value, done }) => {
            // here you know how large the value (chunk) is and you can
            // figure out the download speed/progress by comparing it to the size
            return done
                ? writeStream.close()
                : writeStream.write(value).then(pump)
        })
    // Start the reader
    pump().then(() =>
        console.log('Closed the stream, Done writing')
    )
})
This will not take up any significant memory; each chunk is written out as it arrives.
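For what it's worth, the "later you will be able to" comment above is reality in current browsers: res.body can be piped straight into the write stream. A minimal sketch, assuming a browser with pipeThrough/pipeTo support (the onProgress callback is an illustrative name, not part of any API):
    function downloadWithProgress(url, fileStream, onProgress) {
        return fetch(url).then(res => {
            let received = 0
            // a pass-through stream that counts bytes as they flow by
            const progress = new TransformStream({
                transform(chunk, controller) {
                    received += chunk.byteLength
                    onProgress(received)
                    controller.enqueue(chunk)
                }
            })
            return res.body.pipeThrough(progress).pipeTo(fileStream)
        })
    }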

I have a theory: if you split the file into chunks, store them in IndexedDB, and later merge them together, it will work.
A blob isn't made of data... it's more like a pointer to where a file can be read from.
Meaning that if you store the chunks in IndexedDB and then do something like this (using FileSaver or an alternative):
    finalBlob = new Blob([blob_A_fromDB, blob_B_fromDB])
    saveAs(finalBlob, 'filename.zip')
But I can't confirm this, since I haven't tested it; it would be good if someone else could.
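For anyone who wants to test that theory, here is a rough sketch, assuming the chunks were stored as Blobs under keys 0..n-1 in an object store named 'chunks' (the store name and helper are mine, for illustration):
    function mergeChunksFromDB(db, chunkCount) {
        return new Promise((resolve, reject) => {
            const store = db.transaction('chunks', 'readonly').objectStore('chunks')
            const blobs = []
            let pending = chunkCount
            for (let i = 0; i < chunkCount; i++) {
                const req = store.get(i)
                req.onsuccess = () => {
                    blobs[i] = req.result // each stored value is a Blob chunk
                    if (--pending === 0) resolve(new Blob(blobs, { type: 'application/zip' }))
                }
                req.onerror = () => reject(req.error)
            }
        })
    }
    // usage: mergeChunksFromDB(db, n).then(blob => saveAs(blob, 'filename.zip'))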

Blobs are cool until you want to download a large file; there is a ~600 MB limit for blobs in Chrome, since it stores everything in memory.

Related

Send XMLHttpRequest data in chunks or as ReadableStream to reduce memory usage for large data

I've been trying to use JS's XMLHttpRequest Class for file uploading. I initially tried something like this:
    const file = thisFunctionReturnsAFileObject();
    const request = new XMLHttpRequest();
    request.open('POST', '/upload-file');
    const rawFileData = await file.arrayBuffer();
    request.send(rawFileData);
The above code works (yay!), and sends the raw binary data of the file to my server.
However... it uses a TON of memory (because the whole file gets stored in memory, and JS isn't particularly memory friendly). I found out that on my machine (16 GB RAM) I couldn't send files larger than ~100 MB, because JS would allocate too much memory and the Chrome tab would crash with a SIGILL code.
So, I thought it would be a good idea to use ReadableStreams here. It has good enough browser compatibility in my case (https://caniuse.com/#search=ReadableStream) and my TypeScript compiler told me that request.send(...) supports ReadableStreams (I later came to the conclusion that this is false). I ended up with code like this:
    const file = thisFunctionReturnsAFileObject();
    const request = new XMLHttpRequest();
    request.open('POST', '/upload-file');
    const fileStream = file.stream();
    request.send(fileStream);
But my TypeScript compiler betrayed me (which hurt) and I received "[object ReadableStream]" on my server ಠ_ಠ.
I still haven't explored the above method too much, so I'm not sure if there might be a way to do this. I'd also appreciate help on this very much!
Splitting the request into chunks would be an optimal solution: once a chunk has been sent, we can remove it from memory, before the whole request has even been received.
I have searched and searched, but haven't found a way to do this yet (which is why I'm here...). Something like this in pseudocode would be optimal:
    const file = thisFunctionReturnsAFileObject();
    const request = new XMLHttpRequest();
    request.open('POST', '/upload-file');
    const fileStream = file.stream();
    const fileStreamReader = fileStream.getReader();
    const sendNextChunk = async () => {
        const chunk = await fileStreamReader.read();
        if (!chunk.done) { // chunk.done implies that there is no more data to be read
            request.writeToBody(chunk.value); // chunk.value is a Uint8Array
            sendNextChunk(); // keep going until the stream is drained
        } else {
            request.end();
        }
    };
    sendNextChunk();
I'd expect code like this to send the request in chunks and to end the request once all chunks are sent.
The most helpful resource I tried (which didn't work for me):
Method for streaming data from browser to server via HTTP
Didn't work because:
I need the solution to work in a single request
I can't use RTCDataChannel; it must be a plain HTTP request (is there another way to do this than XMLHttpRequest?)
I need it to work in modern Chrome/Firefox/Edge etc. (no IE support is fine)
Edit: I don't want to use multipart-form (FormData Class). I want to send actual binary data read from the filestream in chunks.
You can't do this with XHR afaik. But the more modern fetch API does support passing a ReadableStream for the request body. In your case:
    const file = thisFunctionReturnsAFileObject();
    const response = await fetch('/upload-file', {
        method: 'POST',
        body: file.stream(),
    });
However, I'm not certain whether this will actually use chunked encoding.
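One caveat to add (my note, not part of the answer above): current Chrome also requires the duplex: 'half' option for stream bodies and, as far as I know, an HTTP/2 connection; without the option the fetch rejects with a TypeError. So a hedged version of the same call would be:
    const response = await fetch('/upload-file', {
        method: 'POST',
        body: file.stream(),
        duplex: 'half', // required by current Chrome for ReadableStream bodies
    });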
You are facing a Chrome bug: they set a hard limit of 256 MB on the size of an ArrayBuffer that can be sent.
But anyway, sending an ArrayBuffer creates a copy of the data, so you should rather send your data as a File directly; this will read the File exactly the way you wanted, as a stream, in small chunks.
So, taking your first code block, that would give:
    const file = thisFunctionReturnsAFileObject();
    const request = new XMLHttpRequest();
    request.open('POST', '/upload-file');
    request.send(file);
And this will work in Chrome too, even with files of a few gigabytes. The only limit you would face here would be earlier, in whatever processing you are doing on that File.
Regarding posting ReadableStreams: this will eventually come, but as of today, July 13th 2020, only Chrome has started working on an implementation, we web devs still can't play with it, and the specs are still struggling to settle on something stable.
But that's not a problem for you, since you would not gain anything by doing so anyway. Posting a ReadableStream made from a static File is pointless; both fetch and XHR already do this internally.
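And since the thread started with progress monitoring: sending a File through XHR still lets you watch upload progress without buffering anything yourself. A small sketch of that:
    const request = new XMLHttpRequest();
    request.upload.onprogress = (e) => {
        // e.loaded / e.total give the bytes sent so far vs. the file size
        if (e.lengthComputable) console.log(Math.round(100 * e.loaded / e.total) + '%');
    };
    request.open('POST', '/upload-file');
    request.send(file);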

Blob name issue with new tab in chrome and firefox [duplicate]

In my Vue app I receive a PDF as a blob, and want to display it using the browser's PDF viewer.
I convert it to a File and generate an object URL:
    const blobFile = new File([blob], `my-file-name.pdf`, { type: 'application/pdf' })
    this.invoiceUrl = window.URL.createObjectURL(blobFile)
Then I display it by setting that URL as the data attribute of an object element.
    <object
        :data="invoiceUrl"
        type="application/pdf"
        width="100%"
        style="height: 100vh;">
    </object>
The browser then displays the PDF using the PDF viewer. However, in Chrome, the file name that I provide (here, my-file-name.pdf) is not used: I see a hash in the title bar of the PDF viewer, and when I download the file using either 'right click -> Save as...' or the viewer's controls, it saves the file with the blob's hash (cda675a6-10af-42f3-aa68-8795aa8c377d or similar).
The viewer and file name work as I'd hoped in Firefox; it's only Chrome in which the file name is not used.
Is there any way, using native Javascript (including ES6, but no 3rd party dependencies other than Vue), to set the filename for a blob / object element in Chrome?
[edit] If it helps, the response has the following relevant headers:
Content-Type: application/pdf; charset=utf-8
Transfer-Encoding: chunked
Content-Disposition: attachment; filename*=utf-8''Invoice%2016246.pdf;
Content-Description: File Transfer
Content-Encoding: gzip
Chrome's PDF viewer extension seems to rely on the resource name set in the URI, i.e. the file.ext in protocol://domain/path/file.ext.
So if your original URI contains that filename, the easiest might be to simply point your <object>'s data at the URI you fetched the PDF from directly, instead of going through a Blob.
Now, there are cases where this can't be done, and for these there is a convoluted way, which might not work in future versions of Chrome (and probably not in other browsers), requiring you to set up a Service Worker.
As said above, Chrome parses the URI in search of a filename, so what we have to do is have a URI, with this filename, that points to our blob:// URI.
To do so, we can use the Cache API: store our File as a Response in there under our crafted URL, and then retrieve that File from the Cache in the ServiceWorker.
Or in code,
From the main page
    // register our ServiceWorker
    navigator.serviceWorker.register('/sw.js')
        .then(...
    ...
    async function displayRenamedPDF(file, filename) {
        // we use a hard-coded fake path
        // so we don't interfere with legit requests
        const reg_path = "/name-forcer/";
        const url = reg_path + filename;
        // store our File in the Cache
        const store = await caches.open("name-forcer");
        await store.put(url, new Response(file));
        const frame = document.createElement("iframe");
        frame.width = 400;
        frame.height = 500;
        document.body.append(frame);
        // makes the request to the File we just cached
        frame.src = url;
        // not needed anymore
        frame.onload = (evt) => store.delete(url);
    }
In the ServiceWorker sw.js
    self.addEventListener('fetch', (event) => {
        event.respondWith((async () => {
            const store = await caches.open("name-forcer");
            const req = event.request;
            const cached = await store.match(req);
            return cached || fetch(req);
        })());
    });
Live example (source)
Edit: This actually doesn't work in Chrome...
While it does set the filename correctly in the dialog, Chrome seems unable to retrieve the file when saving it to disk...
It doesn't seem to perform a network request (and thus our SW isn't catching anything), and I don't really know where to look now.
Still, this may be good ground for future work on this.
Another solution, which I didn't take the time to check myself, would be to run your own PDF viewer.
Mozilla has made its JS-based viewer, pdf.js, available, so from there we should be able to set the filename (even though, once again, I haven't dug into it yet).
And as a final note, Firefox is able to use the name property of a File object that a blob URI points to.
So even though it's not what the OP asked for, in FF all it requires is:
    const file = new File([blob], filename);
    const url = URL.createObjectURL(file);
    object.data = url;
In Chrome, the filename is derived from the URL, so as long as you are using a blob URL, the short answer is "No, you cannot set the filename of a PDF object displayed in Chrome." You have no control over the UUID assigned to the blob URL and no way to override that as the name of the page using the object element. It is possible that inside the PDF a title is specified, and that will appear in the PDF viewer as the document name, but you still get the hash name when downloading.
This appears to be a security precaution, but I cannot say for sure.
Of course, if you have control over the URL, you can easily set the PDF filename by changing the URL.
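As a hypothetical sketch of that last point (the route and invoiceId variable are mine, not from the question): serve the PDF from a URL whose last path segment is the desired filename and point the <object> at it directly, skipping the blob URL entirely.
    // instead of URL.createObjectURL(blobFile):
    this.invoiceUrl = `/invoices/${invoiceId}/${encodeURIComponent('my-file-name.pdf')}`;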
I believe Kaiido's answer expresses, briefly, the best solution here:
"if your original URI contains that filename, the easiest might be to simply make your object's data to the URI you fetched the pdf from directly"
Especially for those coming from this similar question, it would have helped me to have more description of a specific implementation (working for pdfs) that allows the best user experience, especially when serving files that are generated on the fly.
The trick here is using a two-step process that perfectly mimics a normal link or button click. The client must (step 1) request the file be generated and stored server-side long enough for the client to (step 2) request the file itself. This requires you have some mechanism supporting unique identification of the file on disk or in a cache.
Without this process, the user will just see a blank tab while file-generation is in-progress and if it fails, then they'll just get the browser's ERR_TIMED_OUT page. Even if it succeeds, they'll have a hash in the title bar of the PDF viewer tab, and the save dialog will have the same hash as the suggested filename.
Here's the play-by-play to do better:
You can use an anchor tag or a button for the "download" or "view in browser" elements
Step 1 of 2 on the client: that element's click event can make a request for the file to be generated only (not transmitted).
Step 1 of 2 on the server: generate the file and hold on to it. Return only the filename to the client.
Step 2 of 2 on the client:
If viewing the file in the browser, use the filename returned from the generate request to then invoke window.open('view_file/<filename>?fileId=1'). That is the only way to indirectly control the name of the file as shown in the tab title and in any subsequent save dialog.
If downloading, just invoke window.open('download_file?fileId=1').
Step 2 of 2 on the server:
view_file(filename, fileId) handler just needs to serve the file using the fileId and ignore the filename parameter. In .NET, you can use a FileContentResult like File(bytes, contentType);
download_file(fileId) must set the filename via the Content-Disposition header as shown here. In .NET, that's return File(bytes, contentType, desiredFilename);
client-side download example:
    download_link_clicked() {
        // show spinner
        ajaxGet(generate_file_url,
            {},
            (response) => {
                // success!
                // the server side is responsible for setting the name
                // of the file when it is being downloaded
                window.open('download_file?fileId=1', "_blank");
                // hide spinner
            },
            () => { // failure
                // hide spinner
                // problem, notify pattern
            },
            null
        );
    }
client-side view example:
    view_link_clicked() {
        // show spinner
        ajaxGet(generate_file_url,
            {},
            (response) => {
                // success!
                let filename = response.filename;
                // simplest, reliable method I know of for controlling
                // the filename of the PDF when viewed in the browser
                window.open('view_file/' + filename + '?fileId=1');
                // hide spinner
            },
            () => { // failure
                // hide spinner
                // problem, notify pattern
            },
            null
        );
    }
I'm using the library pdf-lib; you can click here to learn more about the library.
I solved part of this problem by using the API Document.setTitle("Some title text you want").
The browser displayed my title correctly, but when clicking the download button, the file name was still the previous UUID. Perhaps there is another API in the library that lets you modify the download file name.
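A minimal sketch of that approach, assuming an ES-module setup and that the blob already holds a valid PDF (the helper name is mine):
    import { PDFDocument } from 'pdf-lib';

    // Re-save the PDF with a document title. The viewer shows the title,
    // but, as noted above, downloading still falls back to the blob URL's UUID.
    async function retitlePdf(blob, title) {
        const pdfDoc = await PDFDocument.load(await blob.arrayBuffer());
        pdfDoc.setTitle(title);
        const bytes = await pdfDoc.save();
        return new Blob([bytes], { type: 'application/pdf' });
    }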

How can I get multiple files to upload to the server from a Javascript page without skipping?

I'm working on a research experiment which uses getUserMedia, implemented in recorder.js, to record .wav files from the user's microphone and XMLHttpRequest to upload them to the server. Each file is about 3 seconds long and there are 36 files in total. The files are recorded one after another and sent to the server as soon as they are recorded.
The problem I'm experiencing is that not all of the files end up on the server. Apparently the page script or the PHP script can't keep up with all the requests in a row. How can I make sure that I get all the files? These are important research data, so I need every recording.
Here's the code that sends the files to the server. The audio data is a blob:
    var filename = subjectID + item__number;
    var xhr = new XMLHttpRequest(); // declaration missing from the original excerpt
    xhr.onload = function(e) {
        if (this.readyState === 4) {
            console.log("Server returned: ", e.target.responseText);
        }
    };
    var fd = new FormData();
    fd.append("audio_data", blob, filename);
    xhr.open("POST", "upload_wav.php", true);
    xhr.send(fd);
And this is the php file on the server side:
    print_r($_FILES);
    $input = $_FILES['audio_data']['tmp_name'];
    $output = "audio/" . $_FILES['audio_data']['name'] . ".wav";
    move_uploaded_file($input, $output);
This way of doing things is basically copied from this website:
Using Recorder.js to capture WAV audio in HTML5 and upload it to your server or download locally
I have already tried making the XMLHttpRequest wait by using
    while (xhr.readyState != 4) {
        console.log("Waiting for server...")
    }
It just caused the page to hang.
Would it be better to use Ajax than XMLHttpRequest? Is there something I can do to make sure that all the files get uploaded? I'm pretty new to JavaScript, so code examples are appreciated.
I have no idea what your architecture looks like, but here is a potential solution to your problem.
The solution uses the Web Worker API to offload the file uploading to a sub-process. This is done with the Worker interface of that API. This approach works because there is no contention on the single thread of the main process; web workers run on their own threads.
Using this approach, we do three basic things:
create a new worker passing a script to execute
pass messages to the worker for the worker to deal with
pass messages back to the main process for status updates/replies/resolved data transformation/etc.
The code is heavily commented below to help you understand what is happening and where.
This is the main JavaScript file (script.js)
    // Create a sub-process to handle the file uploads
    ///// STEP 1: create a worker and execute the worker.js file immediately
    let worker = new Worker('worker.js');
    // Fictitious upload count for demonstration
    let uploadCount = 12;
    // repeatedly build and send files every 700 ms
    // This is repeated until uploadCount == 0
    let builder = setInterval(buildDetails, 700);
    // Receive messages from the sub-process and pipe them to the view
    ///// STEP 2: listen for messages from the worker and do something with them
    worker.onmessage = e => {
        let p = document.createElement('pre');
        // e.data represents the message data sent from the sub-process
        p.innerText = e.data;
        document.body.appendChild(p);
    };
    /**
     * Sort of a mock to build up your BLOB (fake here of course)
     *
     * Post the data needed for the FormData() to the worker to handle.
     */
    function buildDetails() {
        let filename = 'subject1234';
        let blob = new Blob(['1234']);
        ///// STEP 3: Send a message to the worker with file details
        worker.postMessage({
            name: "audio_data",
            blob: blob,
            filename: filename
        });
        // Decrease the count
        uploadCount--;
        // if count is zero (== false) stop the fake process
        if (!uploadCount) clearInterval(builder);
    }
This is the sub-process JavaScript file (worker.js)
    // IGNORE the 'fetch_mock.js' import; it is only here to avoid having to stand up a server
    // FormDataPolyFill.js is needed in browsers that don't yet support FormData() in workers
    importScripts('FormDataPolyFill.js', 'fetch_mock.js');
    // RxJS provides a full suite of asynchronous capabilities based around Reactive Programming (nothing to do with ReactJS);
    // What your use case needs is the guarantee that the stream of inputs will all be processed
    importScripts('https://cdnjs.cloudflare.com/ajax/libs/rxjs/6.3.3/rxjs.umd.js');
    // We create a "Subject" that acts as a vessel for our files to upload
    let forms = new rxjs.Subject();
    // This says "every time the forms Subject is updated, run the postFile function with the next item from the stream"
    forms.subscribe(postFile);
    // Listen for messages from the main process and run doIt each time a message is received
    onmessage = doIt;
    /**
     * Takes an event object containing the message
     *
     * The message is presumably the file details
     */
    function doIt(e) {
        var fd = new FormData();
        // e.data represents our details object with three properties
        fd.append(e.data.name, e.data.blob, e.data.filename);
        // Now, place this FormData object into our stream of them so it can be processed
        forms.next(fd);
    }
    // Instead of using XHR, this uses the newer fetch() API based upon Promises
    // https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API
    function postFile(fd) {
        // Post the file to the server (this is mocked in fetch_mock.js and doesn't go anywhere)
        fetch('fake', {
            method: 'post',
            body: fd,
        })
        .then((fd) => {
            // After the request completes, 'then' post a message back to the main thread (if there is a need)
            postMessage("sent: " + JSON.stringify(fd));
        });
    }
Since this will not run in stackoverflow, I've created a plunker so that you can run this example:
http://plnkr.co/edit/kFY6gcYq627PZOATXOnk
If all this seems complicated, you've presented a complicated problem to solve. :-)
Hope this helps.
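If pulling in RxJS and a worker feels heavy, a plain promise chain gives a similar guarantee that every upload is processed, in order. A minimal sketch (the endpoint and field name match the question; the helper name is illustrative):
    // Chain each upload onto the previous one so requests never race each other.
    let uploadQueue = Promise.resolve();

    function enqueueUpload(blob, filename) {
        uploadQueue = uploadQueue
            .then(() => {
                const fd = new FormData();
                fd.append('audio_data', blob, filename);
                return fetch('upload_wav.php', { method: 'POST', body: fd });
            })
            .then((res) => console.log('uploaded', filename, res.status));
        return uploadQueue;
    }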

How to post blob and string with XMLHttpRequest javascript and PHP [duplicate]

I've seen many partial answers to this here and elsewhere, but I am very much a novice coder and am hoping for a thorough solution. I have been able to set up recording audio from a laptop mic in Chrome Canary (v. 29.x) and can, using recorder.js, relatively easily set up recording a .wav file and saving that locally, a la:
http://webaudiodemos.appspot.com/AudioRecorder/index.html
But I need to be able to save the file onto a Linux server I have running. It's the actual sending of the recorded blob data to the server, and saving it out as a .wav file there, that's holding me up. I don't have the requisite PHP and/or AJAX knowledge to save the blob to a URL and to deal, as I have been given to understand, with binaries on Linux, which makes saving that .wav file challenging indeed. I'd greatly welcome any pointers in the right direction.
Client side JavaScript function to upload the WAV blob:
    function upload(blob) {
        var xhr = new XMLHttpRequest();
        xhr.onload = function(e) {
            if (this.readyState === 4) {
                console.log("Server returned: ", e.target.responseText);
            }
        };
        var fd = new FormData();
        fd.append("that_random_filename.wav", blob);
        xhr.open("POST", "<url>", true);
        xhr.send(fd);
    }
PHP file upload_wav.php:
    <?php
    // get the temporary name that PHP gave to the uploaded file
    $tmp_filename = $_FILES["that_random_filename.wav"]["tmp_name"];
    // rename the temporary file (because PHP deletes the file as soon as it's done with it)
    rename($tmp_filename, "/tmp/uploaded_audio.wav");
    ?>
after which you can play the file /tmp/uploaded_audio.wav.
But remember! /tmp/uploaded_audio.wav was created by the user www-data and (by PHP default) is not readable by other users. To automate adding the appropriate permissions, append the line
    chmod("/tmp/uploaded_audio.wav", 0755);
to the end of the PHP (before the closing ?> tag).
Hope this helps.
Easiest way, if you just want to hack that code, is go in to recorderWorker.js, and hack the exportWAV() function to something like this:
    function exportWAV(type) {
        var bufferL = mergeBuffers(recBuffersL, recLength);
        var bufferR = mergeBuffers(recBuffersR, recLength);
        var interleaved = interleave(bufferL, bufferR);
        var dataview = encodeWAV(interleaved);
        var audioBlob = new Blob([dataview], { type: type });
        var xhr = new XMLHttpRequest();
        xhr.onload = function(e) {
            if (this.readyState === 4) {
                console.log("Server returned: ", e.target.responseText);
            }
        };
        var fd = new FormData();
        fd.append("that_random_filename.wav", audioBlob);
        xhr.open("POST", "<url>", true);
        xhr.send(fd);
    }
Then that method will save to server from inside the worker thread, rather than pushing it back to the main thread. (The complex Worker-based mechanism in RecorderJS is because a large encode should be done off-thread.)
Really, ideally, you'd just use a MediaRecorder today, and let it do the encoding, but that's a whole 'nother ball of wax.
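For completeness, a minimal MediaRecorder sketch reusing the upload() function from the first answer above; note (my assumption) that it encodes to webm/opus or similar rather than WAV, so the server side would need to change accordingly:
    navigator.mediaDevices.getUserMedia({ audio: true }).then(function(stream) {
        var recorder = new MediaRecorder(stream);
        var chunks = [];
        recorder.ondataavailable = function(e) { chunks.push(e.data); };
        // when recording stops, bundle the chunks into one blob and upload it
        recorder.onstop = function() {
            upload(new Blob(chunks, { type: recorder.mimeType }));
        };
        recorder.start();
        setTimeout(function() { recorder.stop(); }, 3000); // record ~3 seconds
    });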

How to pass a blob from a Chrome extension to a Chrome app

A Little Background
I've been working for a couple of days on a Chrome extension that takes a screenshot of given web pages multiple times a day. I used this as a guide and things work as expected.
There's one minor requirement extensions can't meet, though. The user must have access to the folder where the images (screenshots) are saved, but Chrome extensions don't have access to the file system. Chrome apps, on the other hand, do. Thus, after much looking around, I've concluded that I must create both a Chrome extension and a Chrome app. The idea is that the extension creates a blob of the screenshot and then sends that blob to the app, which saves it as an image to a user-specified location. And that's exactly what I'm doing: I'm creating a blob of the screenshot on the extension side and then sending it over to the app, where the user is asked to choose where to save the image.
The Problem
Up to the saving part, everything works as expected. The blob is created on the extension, sent over to the app, received by the app, the user is asked where to save, and the image is saved.... THAT is where things fall apart. The resulting image is unusable. When I try to open it, I get a message that says "Can't determine type". Below is the code I'm using:
First ON THE EXTENSION side, I create a blob and send it over, like this:
    chrome.runtime.sendMessage(
        APP_ID, /* I got this from the app */
        { incomingBlob: blob }, /* Blob created previously; it's correct */
        function(response) {
            appendLog("response: " + JSON.stringify(response));
        }
    );
Then, ON THE APP side, I receive the blob and attempt to save it like this:
    // listen for external messages
    chrome.runtime.onMessageExternal.addListener(
        function(request, sender, sendResponse) {
            if (sender.id in blacklistedIds) {
                sendResponse({ "result": "sorry, could not process your message" });
                return; // don't allow this extension access
            } else if (request.incomingBlob) {
                appendLog("from " + sender.id + ": " + request.incomingBlob);
                // attempt to save blob to chosen location
                if (_folderEntry == null) {
                    // get a directory to save in if not yet chosen
                    openDirectory();
                }
                saveBlobToFile(request.incomingBlob, "screenshot.png");
                /*
                // inspect object to try to see what's wrong
                var keys = Object.keys(request.incomingBlob);
                var keyString = "";
                for (var key in keys) {
                    keyString += " " + key;
                }
                appendLog("Blob object keys:" + keyString);
                */
                sendResponse({ "result": "Ok, got your message" });
            } else {
                sendResponse({ "result": "Oops, I don't understand this message" });
            }
        }
    );
Here's the function ON THE APP that performs the actual save:
    function saveBlobToFile(blob, fileName) {
        appendLog('entering saveBlobToFile function...');
        chrome.fileSystem.getWritableEntry(_folderEntry, function(entry) {
            entry.getFile(fileName, { create: true }, function(entry) {
                entry.createWriter(function(writer) {
                    //writer.onwrite = function() {
                    //    writer.onwrite = null;
                    //    writer.truncate(writer.position);
                    //};
                    appendLog('calling writer.write...');
                    writer.write(blob);
                    // Also tried writer.write(new Blob([blob], {type: 'image/png'}));
                });
            });
        });
    }
There are no errors. No hiccups. The code works but the image is useless. What exactly am I missing? Where am I going wrong? Is it that we can only pass strings between extensions/apps? Is the blob getting corrupted on the way? Does my app not have access to the blob because it was created on the extension? Can anyone please shed some light?
UPDATE (9/23/14)
Sorry for the late update, but I was assigned to a different project and could not get back to this until 2 days ago.
So after much looking around, I've decided to go with Daniel Herr's suggestion: use a SharedWorker and a page embedded in a frame in the app. The idea is that the extension supplies the blob to the SharedWorker, which forwards it to a page from the extension that is embedded in a frame in the app. That page then forwards the blob to the app using parent.postMessage(...). It's a bit cumbersome, but it seems to be the only option I have.
Let me post some code so that it makes a bit more sense:
Extension:
    var worker = new SharedWorker(chrome.runtime.getURL('shared-worker.js'));
    worker.port.start();
    worker.port.postMessage('hello from extension'); // Can send a blob here too
    worker.port.addEventListener("message", function(event) {
        $('h1Title').innerHTML = event.data;
    });
proxy.js
    var worker = new SharedWorker(chrome.runtime.getURL('shared-worker.js'));
    worker.port.start();
    worker.port.addEventListener("message",
        function(event) {
            parent.postMessage(event.data, 'chrome-extension://[extension id]');
        }
    );
proxy.html
<script src='proxy.js'></script>
shared-worker.js
    var ports = [];
    var count = 0;
    onconnect = function(event) {
        count++;
        var port = event.ports[0];
        ports.push(port);
        port.start();
        /*
        On both the extension and the app, I get count = 1 and ports.length = 1.
        I'm running them side by side. This is so maddening!!!
        What am I missing?
        */
        var msg = 'Hi, you are connection #' + count + ". ";
        msg += " There are " + ports.length + " ports open so far.";
        port.postMessage(msg);
        port.addEventListener("message",
            function(event) {
                for (var i = 0; i < ports.length; ++i) {
                    //if (ports[i] != port) {
                    ports[i].postMessage(event.data);
                    //}
                }
            });
    };
On the app
context.addEventListener("message",
function(event) {
appendLog("message from proxy: " + event.data);
}
);
So this is the execution flow... On the extension I create a shared worker and send a message to it. The shared worker should be capable of receiving a blob but for testing purposes I'm only sending a simple string.
Next, the shared worker receives the message and forwards it to everyone who has connected. The proxy.html/js which is inside a frame in the app has indeed connected at this point and should receive anything forwarded by the shared worker.
Next, proxy.js [should] receive the message from the shared worker and send it to the app using parent.postMessage(...). The app is listening via window.addEventListener("message", ...).
To test this flow, I first open the app, then I click the extension button. I get no message on the app. I get no errors either.
The extension can communicate back and forth with the shared worker just fine. The app can communicate with the shared worker just fine. However, the message I sent from the extension->proxy->app does not reach the app. What am I missing?
Sorry for the long post guys, but I'm hoping someone will shed some light as this is driving me insane.
Thanks
Thanks for all your help, guys. I found the solution to be to convert the blob into a binary string in the extension and then send the string over to the app using Chrome's message-passing API. In the app, I then did what Francois suggested to convert the binary string back into a blob. I had tried this solution before, but it hadn't worked, because I was using the following code in the app:
    blob = new Blob([blobAsBinString], { type: mimeType });
That code may work for text files or simple strings, but it fails for images (perhaps due to character-encoding issues). That's where I was going insane. The solution is to use what Francois provided from the beginning:
    var bytes = new Uint8Array(blobAsBinString.length);
    for (var i = 0; i < bytes.length; i++) {
        bytes[i] = blobAsBinString.charCodeAt(i);
    }
    blob = new Blob([bytes], { type: mimeString });
That code retains the integrity of the binary string, and the blob is recreated properly in the app.
Now, I also incorporated something suggested by some of you here and by RobW elsewhere, which is to split the blob into chunks and send it over like that, in case the blob is too large. The entire solution is below:
ON THE EXTENSION:
    function sendBlobToApp() {
        // read the blob in chunks and send it to the app
        // Note: I crashed the app using 1 KB chunks. 1 MB chunks work just fine.
        // I decided to use 256 KB as that seems neither too big nor too small
        var CHUNK_SIZE = 256 * 1024;
        var start = 0;
        var stop = CHUNK_SIZE;
        var remainder = blob.size % CHUNK_SIZE;
        var chunks = Math.floor(blob.size / CHUNK_SIZE);
        var chunkIndex = 0;
        if (remainder != 0) chunks = chunks + 1;
        var fr = new FileReader();
        fr.onload = function() {
            var message = {
                blobAsText: fr.result,
                mimeString: mimeString,
                chunks: chunks
            };
            // APP_ID was obtained elsewhere
            chrome.runtime.sendMessage(APP_ID, message, function(result) {
                if (chrome.runtime.lastError) {
                    // Handle error, e.g. app not installed
                    // appendLog is defined elsewhere
                    appendLog("could not send message to app");
                }
            });
            // read the next chunk of bytes
            processChunk();
        };
        fr.onerror = function() { appendLog("An error occurred while reading file"); };
        processChunk();
        function processChunk() {
            chunkIndex++;
            // exit if there are no more chunks
            if (chunkIndex > chunks) {
                return;
            }
            if (chunkIndex == chunks && remainder != 0) {
                stop = start + remainder;
            }
            var blobChunk = blob.slice(start, stop);
            // prepare for next chunk
            start = stop;
            stop = stop + CHUNK_SIZE;
            // convert the chunk to a binary string
            fr.readAsBinaryString(blobChunk);
        }
    }
ON THE APP
    chrome.runtime.onMessageExternal.addListener(
        function(request, sender, sendResponse) {
            if (sender.id in blacklistedIds) {
                return; // don't allow this extension access
            } else if (request.blobAsText) {
                // new chunk received
                _chunkIndex++;
                var bytes = new Uint8Array(request.blobAsText.length);
                for (var i = 0; i < bytes.length; i++) {
                    bytes[i] = request.blobAsText.charCodeAt(i);
                }
                // store blob
                _blobs[_chunkIndex - 1] = new Blob([bytes], { type: request.mimeString });
                if (_chunkIndex == request.chunks) {
                    // merge all blob chunks
                    var mergedBlob;
                    for (var j = 0; j < _blobs.length; j++) {
                        if (j > 0) {
                            // append blob
                            mergedBlob = new Blob([mergedBlob, _blobs[j]], { type: request.mimeString });
                        } else {
                            mergedBlob = new Blob([_blobs[j]], { type: request.mimeString });
                        }
                    }
                    saveBlobToFile(mergedBlob, "myImage.png", request.mimeString);
                }
            }
        }
    );
Does my app not have access to the blob because it was created on the
extension? Can anyone please shed some light?
Exactly! You may want to pass a data URL instead of a blob. Something like the code below could work:
    /* Chrome Extension */
    var blobToDataURL = function(blob, cb) {
        var reader = new FileReader();
        reader.onload = function() {
            // pass the full data: URL along; dataURLtoBlob on the app side
            // expects the "data:<mime>;base64,<data>" prefix to be present
            cb(reader.result);
        };
        reader.readAsDataURL(blob);
    };
    blobToDataURL(blob, function(dataUrl) {
        chrome.runtime.sendMessage(APP_ID, { dataUrl: dataUrl }, function() {});
    });
    /* Chrome App */
    function dataURLtoBlob(dataURL) {
        var byteString = atob(dataURL.split(',')[1]),
            mimeString = dataURL.split(',')[0].split(':')[1].split(';')[0];
        var ab = new ArrayBuffer(byteString.length);
        var ia = new Uint8Array(ab);
        for (var i = 0; i < byteString.length; i++) {
            ia[i] = byteString.charCodeAt(i);
        }
        var blob = new Blob([ia], { type: mimeString });
        return blob;
    }
    chrome.runtime.onMessageExternal.addListener(
        function(request) {
            var blob = dataURLtoBlob(request.dataUrl);
            saveBlobToFile(blob, "screenshot.png");
        });
I am extremely interested in this question, as I am trying to accomplish something similar.
These are the questions that I have found to be related:
How can a Chrome extension save many files to a user-specified directory?
Implement cross extension message passing in chrome extension and app
Does chrome.runtime support posting messages with transferable objects?
Pass File object to background.js from content script or pass createObjectURL (and keep alive after refresh)
According to Rob W, in the first link:
"Chrome's fileSystem (app) API can directly write to the user's filesystem (e.g. ~/Documents or %USERPROFILE%\Documents), specified by the user."
If you can write to a user's filesystem, you should be able to read from it, right?
I haven't had the opportunity to try this out, but instead of directly passing the file blob to the app, you could save the item to the user's downloads using the Chrome extension downloads API.
Then you could retrieve it with the Chrome app fileSystem API to gain access to it.
Edit:
I keep reading that the filesystem that API can access is sandboxed, so I have no idea if this solution is possible. Being sandboxed and Rob W's description of "writing directly to the user's filesystem" sound like opposites to me.
Edit:
Rob W has revised his answer here: Implement cross extension message passing in chrome extension and app.
It no longer uses a shared worker, and passes file data as a string to the backend, which can turn the string back into a blob.
I'm not sure what the max length of a message is, but Rob W also mentions a solution for slicing up blobs to send them in pieces.
Edit:
I have sent 43 MB of data without crashing my app.
That's a really interesting question. From my point of view it can be done using these techniques:
First of all, you should convert your blob to an ArrayBuffer. This can be done with FileReader, and it is an async operation.
Then comes some magic from the Encoding API, which is currently available in stable Chrome: you convert your ArrayBuffer into a string. This operation is sync.
Then you can communicate with other extensions/apps using the Chrome API as usual. I am using this technique to promote one of my apps (a new packaged app) from another famous legacy app, and since legacy packaged apps are in fact extensions, I think everything will be okay.
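A rough sketch of those three steps, with one caveat that is mine rather than the answer's: a utf-8 TextDecoder is not binary-safe, so for arbitrary image bytes the chunked binary-string or base64 approaches shown earlier are more robust:
    var fr = new FileReader();
    fr.onload = function() {
        // step 1 gave us an ArrayBuffer; step 2: decode it into a string
        // (lossy for arbitrary binary data; see caveat above)
        var text = new TextDecoder('utf-8').decode(fr.result);
        // step 3: send it through Chrome's message-passing API
        chrome.runtime.sendMessage(APP_ID, { payload: text });
    };
    fr.readAsArrayBuffer(blob); // step 1: blob -> ArrayBuffer (async)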
