There are several solutions for saving a data blob to a file in a browser environment. I normally use FileSaver.js.
In this particular case, though, my data comes in from a websocket stream, and the resulting file might be so large (hundreds of MB) that concatenating it in memory really kills performance.
Is it at all possible to open a stream that data can be written to as it comes in?
Edit:
The websocket is an XMPP connection. The file transfer is negotiated via Stream Initiation (XEP-0095) using the SI File Transfer (XEP-0096) protocol; the data is then transferred via an In Band Bytestream (XEP-0047).
Basically, the web application gets the data in base64-encoded chunks of 4096 bytes, and I'd like to know if there is any way to let the user save the file before all the data is received.
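As a sketch of the receiving side, each 4096-byte chunk can be decoded from base64 the moment it arrives, instead of being accumulated into one giant string. The helper name `decodeChunk` is mine, not part of any XMPP library:

```javascript
// Decode one base64-encoded IBB chunk into raw bytes as it arrives.
// atob() is available in browsers and in modern Node.
function decodeChunk(b64) {
  const binary = atob(b64);              // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);     // one byte per char code
  }
  return bytes;
}
```

In browsers that support the File System Access API, each decoded chunk could then be written straight to a writable stream obtained from `showSaveFilePicker()`, so the user picks the destination file before the transfer finishes.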
Related
I have a middleware that reads multipart/form-data and returns any files submitted attached to the request body. I use busboy to process the input and return the file contents as a buffer. I read that buffers can consume a lot of memory and hence ReadableStreams are a better choice. I don't understand why this is so: even a stream has to store all the underlying data somewhere in memory, right? So how can that be better than directly accessing the buffer itself?
Yeah, it does store some data, but only a small fraction of it. Say you want to upload a file somewhere. With a buffer you would potentially have to read a 1 GB file into memory and then upload the whole thing at once. If you don't have that 1 GB of memory available, or if you're uploading multiple files at the same time, you will simply run out of memory. With streams, you can process the data as you read it: you load a small chunk, upload it, free the memory, load the next chunk, and so on.
If you're building a server application the idea is the same. You start receiving a file on your server and you can start processing whatever piece of the data you have before the client manages to upload the whole thing. TCP is built in such a way that the client simply cannot upload the stuff faster than you can process it. At no point will you have the entire file in your memory.
I have a use case where in I need to provide a file of the size of 500 MB as a downloadable to users in the browser. The file itself is stored in an S3 bucket in the backend. As an alternate to the approach of fetching the file with a single GET S3 call, I am exploring an approach to make multiple ranged based GET calls to S3 such that the total data fetched by the ranged GET calls can be concatenated together to recreate the original file.
As the file size is on the higher side, I won't be making all the S3 ranged GET calls together, so that the entire file data [from the GET call responses] is never loaded into browser memory.
Instead I plan to execute the S3 ranged GET calls in batches such that the browser memory consumed by the downloads is equal to the size of data fetched by that batch.
As a next step I am looking for a way to write the data from the GET calls batch directly to the file system [instead of maintaining them as a blob in browser memory]. The write operation should be such that data from all the batches is concatenated and stored as a single file on the user’s file system.
Are there any known ways to write data to file system in such a manner?
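One possible shape for the batching part: compute the `Range` header values up front, then fetch and write them batch by batch. `rangeHeaders` is a hypothetical helper, not an S3 SDK function:

```javascript
// Split a file of `totalSize` bytes into HTTP Range header values
// of at most `chunkSize` bytes each (byte ranges are inclusive).
function rangeHeaders(totalSize, chunkSize) {
  const ranges = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalSize) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}
```

For the writing part, browsers that support the File System Access API let you open a single destination file with `showSaveFilePicker()`, get a writable via `createWritable()`, and `write()` each batch's response body in order before calling `close()`, so the batches end up concatenated in one file without living in memory together.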
I was able to set up an Arduino to stream audio from a microphone into a Linux server that's hosting an MQTT server. I then have a Go script that subscribes to the MQTT server, saves the payload to disk as a binary file, and converts the binary file to a .WAV file with FFmpeg.
Is it possible to have a web browser use only client-side code to subscribe to the same MQTT server, receive the audio payload from the Arduino, and stream the audio in near-real-time to the human listener's computer speakers? I see a Paho JavaScript Client library that can help me connect to MQTT, but it seems to receive payloads as strings, and it isn't evident to me how I'd stream audio content with that. Hence why I'm asking whether it's even practical/feasible.
Or will I need to build another server-side script to stream MQTT data as audio data for a web client?
To ensure it works in all environments, ensure that you use MQTT over WebSocket to connect to the server.
Here is a discussion of this: Can a web browser use MQTT?
Look closer at the Paho doc: there is a function to get the message payload as binary data, using the message.payloadBytes field.
payloadBytes | ArrayBuffer | read only | The payload as an ArrayBuffer
An example is described here:
https://www.hardill.me.uk/wordpress/2014/08/29/unpacking-binary-data-from-mqtt-in-javascript/
But basically you end up with an ArrayBuffer holding the binary data, which you can then convert to a Typed Array and read back values at whatever offset you need.
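For the audio use case above, a sketch of that unpacking step might look like this, assuming the Arduino sends little-endian 16-bit PCM (an assumption about the payload format, not something the question states):

```javascript
// Unpack little-endian 16-bit PCM samples from an ArrayBuffer
// (e.g. message.payloadBytes) into Float32 samples in [-1, 1],
// the format the Web Audio API's AudioBuffer expects.
function pcm16ToFloat32(buffer) {
  const view = new DataView(buffer);
  const out = new Float32Array(buffer.byteLength / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = view.getInt16(i * 2, /* littleEndian */ true) / 32768;
  }
  return out;
}
```

The resulting Float32Array could then be copied into an AudioBuffer and scheduled for playback with an AudioContext, which is how the browser would get the stream to the speakers without a server-side transcoding step.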
I am making a file-sharing app in JavaScript. I have to send and receive large files which I cannot store in a buffer or in memory. I want to store each incoming data chunk directly to disk. Any reference link will be appreciated.
FileWriter is a good place to start, if you only care about Chrome.
Otherwise, an interesting hack is using IndexedDB to store chunks as blobs, because they'll technically be stored to disk, then constructing a big blob out of these chunks, and providing a link to it with URL.createObjectURL. It doesn't involve loading anything into memory, since blobs are just references to data, not the data itself, and in this case, all the data is stored off-memory, inside IndexedDB. The only problem here is the extra copy of all the data.
It's not as nice as FileWriter, but this hack is the only solution to work across many browsers (Safari being a notable exception, as always.)
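A minimal sketch of that hack, with illustrative names (the `chunks` store and both function names are mine):

```javascript
// Store one incoming chunk per IndexedDB record; the records live on
// disk, not in JS heap memory. `db` is an open IDBDatabase with a
// 'chunks' object store.
function saveChunk(db, index, chunk) {
  return db.transaction('chunks', 'readwrite')
           .objectStore('chunks')
           .put(chunk, index);
}

// Build the final Blob out of the stored chunks. A Blob composed of
// other Blobs/ArrayBuffers references the underlying data rather than
// copying it into memory.
function assembleBlob(chunks, mimeType) {
  return new Blob(chunks, { type: mimeType });
}
```

Once all chunks are read back in order, `URL.createObjectURL(assembleBlob(chunks, type))` gives you an href the user can click to save the file.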
When uploading a large file to a web server for processing, the upload itself can take a long time, and unless the file can be processed sequentially, the server has to wait until the entire file has been received before processing can begin.
Is it possible to make some javascript "glue" that lets the web server request specific ranges of data from a file on the client's computer as necessary? What I'm asking for is in a way the same as HTTP Range capability, only the server would be generating the requests and javascript on the client computer would be serving the data.
By doing this, the web server could process files as they're being uploaded, for example video files with important information in the footer, or compressed archives. The web server could unpack parts of a zip file without uploading the whole archive, and could even check that the zip file structure is correct before a large file is uploaded. In fact, any generic processing of a large file that can't be read sequentially could be done at the same time as the file is being transferred, instead of first uploading the file and then processing it later. Is it possible to do something like this from the browser without deploying something like java or flash?
If it's possible to read bytes as necessary from the file, it's also conceivable that the web server wouldn't need to have space to store the entire file on a local drive, but simply access the file directly from the client's drive and process it in memory. This would probably be inefficient for some use cases, but the possibility is interesting.
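The client side of this could be quite small, since `File` inherits `Blob.slice()`. Here is a sketch assuming the server sends HTTP-style range specs over a WebSocket; `parseRange` and the wire format are my assumptions, not an existing protocol:

```javascript
// Parse a server-issued range spec like "bytes=100-199" into the
// arguments Blob.slice() wants (start inclusive, end exclusive).
function parseRange(spec) {
  const m = /^bytes=(\d+)-(\d+)$/.exec(spec);
  if (!m) throw new Error('bad range: ' + spec);
  return { start: Number(m[1]), end: Number(m[2]) + 1 };
}

// In the browser this would sit in a message handler, e.g.:
// ws.onmessage = async (ev) => {
//   const { start, end } = parseRange(ev.data);
//   ws.send(await file.slice(start, end).arrayBuffer());
// };
```

Because `slice()` only creates a reference into the underlying file, the browser reads just the requested bytes from disk when the slice is actually sent, which is what makes the random-access pattern feasible.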