JavaScript: saving each incoming data chunk directly to disk

I am making a file-sharing app in JavaScript. I have to send and receive large files which I cannot hold in a buffer or in memory. I want to store each incoming data chunk directly to disk. Any reference link will be appreciated.

FileWriter is a good place to start, if you only care about Chrome.
Otherwise, an interesting hack is using IndexedDB to store chunks as blobs, because they'll technically be stored to disk, then constructing a big blob out of these chunks and providing a link to it with URL.createObjectURL. It doesn't involve loading anything into memory, since blobs are just references to data, not the data itself; in this case, all the data is stored off-memory, inside IndexedDB. The only problem here is the extra copy of all the data.
It's not as nice as FileWriter, but this hack is the only solution that works across many browsers (Safari being a notable exception, as always).
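For illustration, a minimal sketch of that hack. The database and store names, and helpers like saveChunk, are made up for this example, and error handling is kept to a minimum:

    // Sketch only: store each incoming chunk in IndexedDB, then build one Blob.
    function openDb() {
      return new Promise((resolve, reject) => {
        const req = indexedDB.open('file-chunks', 1);
        req.onupgradeneeded = () =>
          req.result.createObjectStore('chunks', { autoIncrement: true });
        req.onsuccess = () => resolve(req.result);
        req.onerror = () => reject(req.error);
      });
    }

    // Each chunk (ArrayBuffer or Blob) goes straight into IndexedDB,
    // so it never has to stay in memory.
    function saveChunk(db, chunk) {
      return new Promise((resolve, reject) => {
        const tx = db.transaction('chunks', 'readwrite');
        tx.objectStore('chunks').add(new Blob([chunk]));
        tx.oncomplete = resolve;
        tx.onerror = () => reject(tx.error);
      });
    }

    // Stitch the stored blobs into one big Blob. Blobs are references,
    // not bytes, so this does not pull the file into memory.
    function buildDownloadUrl(db) {
      return new Promise((resolve, reject) => {
        const parts = [];
        const req = db.transaction('chunks').objectStore('chunks').openCursor();
        req.onsuccess = () => {
          const cursor = req.result;
          if (cursor) { parts.push(cursor.value); cursor.continue(); }
          else resolve(URL.createObjectURL(new Blob(parts)));
        };
        req.onerror = () => reject(req.error);
      });
    }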

Related

Why is ReadableStream preferable to Buffer when returning file data from a middleware?

I have a middleware that reads multipart/form-data and attaches any submitted files to the request body. I use busboy to process the input and return the file contents as a buffer. I read that buffers can consume a lot of memory and that ReadableStreams are hence a better choice. I don't understand why: even a stream has to store all the underlying data somewhere in memory, right? So how can that be better than directly accessing the buffer itself?
Yeah, it does store some data, but only a small fraction of it. Say you want to upload a file somewhere. With a buffer you would potentially have to read a 1 GB file into memory and then upload the whole thing at once. If you don't have that 1 GB of memory available, or if you're uploading multiple files at the same time, you will simply run out of memory. With streams, you can process data as you read it: you load one byte of data, you upload it, you free your memory, you load another byte, you upload that, and so on.
If you're building a server application the idea is the same. You start receiving a file on your server and you can start processing whatever piece of the data you have before the client manages to upload the whole thing. TCP is built in such a way that the client simply cannot upload the stuff faster than you can process it. At no point will you have the entire file in your memory.
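To make the difference concrete, here is a minimal sketch using plain Node.js streams (busboy hands you each uploaded file as the same kind of readable stream; the file path is just an example):

    const http = require('http');
    const fs = require('fs');
    const { pipeline } = require('stream');

    http.createServer((req, res) => {
      // req is a readable stream: each chunk is written to disk as it
      // arrives, so memory use stays bounded no matter how big the upload.
      // pipeline() also applies backpressure: if the disk is slower than
      // the network, reading from the socket pauses -- the TCP behavior
      // described above.
      pipeline(req, fs.createWriteStream('/tmp/upload.bin'), (err) => {
        res.end(err ? 'upload failed' : 'upload complete');
      });
    }).listen(8080);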

Saving a blob in JavaScript, while the blob is still being constructed

There are several solutions for saving a data blob to a file in a browser environment. I normally use FileSaver.js.
In this particular case, though, my data comes in from a websocket stream, and the resulting file might be so large (hundreds of MB) that concatenating it in memory really kills performance.
Is it at all possible to open a stream that data can be written to as it comes in?
Edit:
The websocket is an XMPP connection. The file transfer is negotiated via Stream Initiation (XEP-0095) using the SI File Transfer (XEP-0096) protocol; the data is then transferred via an In Band Bytestream (XEP-0047).
Basically, the web application gets the data in base64-encoded chunks of 4096 bytes, and I'd like to know if there is any way to let the user save the file before all the data is received.
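For reference, a minimal sketch of the baseline approach (the handler name onIbbData is made up for this example; note this still assembles the file as a Blob at the end, so on its own it doesn't let the user save before the transfer completes):

    const parts = [];

    // Called once per incoming IBB message with the base64 payload.
    function onIbbData(base64Chunk) {
      const binary = atob(base64Chunk);
      const bytes = new Uint8Array(binary.length);
      for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
      // Keep each decoded chunk as its own Blob instead of concatenating
      // strings; the expensive assembly is deferred to the Blob layer.
      parts.push(new Blob([bytes]));
    }

    // When the last chunk has arrived, hand the assembled Blob to the user.
    function finish(filename) {
      const a = document.createElement('a');
      a.href = URL.createObjectURL(new Blob(parts));
      a.download = filename;
      a.click();
    }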

IE6-compatible way to send an uploaded file via JavaScript to an API

Right now users have two ways to send data to my application: uploading a CSV file via the browser, or via our API.
It would dramatically reduce duplication to use the API in both instances. Is there a way they could use the existing upload form, but instead of storing the CSV on the server, it would be processed via Javascript and sent to the API?
The solution would unfortunately have to be IE6 compatible.
There is no way to do this with IE6 (and in my opinion there is no reason to even try this).
Possible workarounds:
Embed the upload form into an iframe (ugly)
Use Flash Player 9+ or a Java applet to perform the upload (requires a plugin); this also gives you the possibility to process the data before sending it.
Unfortunately, no, this is not possible with IE6. Reading local files is possible in HTML5, but IE has a long way to go...
Also, it is not required to store the CSV on the server: you can process the CSV dynamically in a servlet and use the API to store the data, without ever persisting the CSV itself.

Coordinating filesystem activity in Node.js

What is the best practice for coordinating access to files in Node.js?
I'm trying to write an HTTP-based file uploader for very large files (tens of GB) that is resumable. I'm trying to figure out the best approach to handle two people trying to upload the same file at the same time... I'm also trying to think ahead to the possibility that more than one copy of the Node.js HTTP server is running behind a load balancer, which means catching duplicate uploads can't rely on just the code itself.
In Python, for example, you can create a file by passing the correct flags to the open() call to force an atomic create. I'm not sure whether Node.js's default open-new-file behavior is atomic.
Another option I thought of, but don't really want to pursue, is using a database with an async driver that supports atomic transactions to track this state...
In order to know if multiple users are uploading the same file, you will have to identify the files somehow. Hashing is best for this. First, hash the entire file on the client side to identify it. Tell the server the hash of the file; if there is already a file on the server with the same hash, then the file has already been uploaded or is currently being uploaded.
Since this is an HTTP file server, you will likely want users to upload files from a browser. You can get the contents of a file in the browser using the FileReader API. Unfortunately, as of now, this isn't widely supported. You might have to use something like Flash to get it to work in other browsers.
As you stream the file into memory with the file reader, you will want to break it into chunks and hash the chunks. Then send the server all of the file's hashed chunks. It's important that you break the file into chunks and hash those individual chunks instead of the contents of the entire file, because otherwise the client could send one hash and upload an entirely different file.
After the hashes are received and compared to other files' hashes and it turns out someone else is currently uploading the same file, the server then decides which user gets to upload which chunks of the file. The server then tells the uploading clients what chunks it wants from them, and the clients upload their corresponding chunks.
As each chunk is finished uploading, it is rehashed on the server and compared with the original array of hashes to verify that the user is uploading the correct file.
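A minimal sketch of the client-side chunk hashing, assuming a modern browser with File.slice and crypto.subtle (at the time of this answer that support wasn't there yet, hence the Flash remark); the chunk size is an arbitrary choice:

    // Arbitrary chunk size; the server must use the same one.
    const CHUNK_SIZE = 1024 * 1024;

    async function hashChunks(file) {
      const hashes = [];
      for (let offset = 0; offset < file.size; offset += CHUNK_SIZE) {
        const buf = await file.slice(offset, offset + CHUNK_SIZE).arrayBuffer();
        const digest = await crypto.subtle.digest('SHA-256', buf);
        // Hex-encode so the list can be sent to the server as JSON.
        hashes.push([...new Uint8Array(digest)]
          .map((b) => b.toString(16).padStart(2, '0')).join(''));
      }
      return hashes; // server compares these to decide which chunks it wants
    }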
I found this on Hacker News in a response to someone complaining about some of the same things in Node.js. I'll put it here for completeness. It lets me at least lock some file writes in Node.js like I wanted to.
IsaacSchlueter: You can open a file with O_EXCL if you pass in the open flags as a number. (You can find them on require("constants"), and they need to be binary-OR'ed together.) This isn't documented. It should be. It should probably also be exposed in a cleaner way. Most of the rest of what you describe is APIs that need to be polished and refined a bit. The boundaries are well defined at this point, though. We probably won't add another builtin module at this point, or dramatically expand what any of them can do. (I don't consider seek() dramatic, it's just tricky to get right given JavaScript's annoying Number problems.)
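For reference, a minimal sketch of an exclusive create in today's Node.js, where the 'wx' flag (O_CREAT | O_EXCL | O_WRONLY) is documented and the constants trick is no longer needed (the file path is just an example):

    const fs = require('fs');

    // 'wx' means: create the file, fail with EEXIST if it already exists.
    // The check and the create happen in one atomic open(2) call.
    fs.open('/tmp/upload.part', 'wx', (err, fd) => {
      if (err && err.code === 'EEXIST') {
        // Another request (or another server behind the load balancer,
        // if they share a filesystem) already claimed this file.
        return console.log('upload already in progress');
      }
      if (err) throw err;
      // We won the race: this process owns the file.
      fs.close(fd, () => {});
    });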

XForms: how to deal with instance data that changes?

At the moment I am working on an XForms application to mutate XML data. This data comes from a local XML file. The file is exported statically from another application and read into the XForms application. The problem is that the data changes from time to time (the XML structure remains the same). How can I handle this in XForms? I use XSLTForms in my application.
XSLTForms cannot directly access local data files, because JavaScript is never allowed to do that, for security reasons.
For local-only use, it's always possible to run a local HTTP server, which can be minimal.
If the data file changes independently, there is another problem: given the client-server architecture, XForms can only periodically check the file contents.
-Alain
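For reference, a minimal sketch of such a minimal local HTTP server, assuming Node.js is available (the file name data.xml is made up; XSLTForms would then load the instance over http:// instead of from the local filesystem):

    const http = require('http');
    const fs = require('fs');

    http.createServer((req, res) => {
      // Re-read the exported file on every request, so the form always
      // gets the latest export when it (re-)loads the instance.
      fs.readFile('data.xml', (err, xml) => {
        if (err) { res.writeHead(404); return res.end(); }
        res.writeHead(200, { 'Content-Type': 'application/xml' });
        res.end(xml);
      });
    }).listen(8080);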
