Node.js - Pipe a file stream from one server to another - javascript

I'm trying to use Node.js to get a file from a remote URL then send it to another server (using an API provided by each of the two websites). I already managed to successfully upload a local file to the remote server using fs.createReadStream("file.png"). Remote files seem to be a different story however: I can't simply put "https://website.com/file.png" in there, I need an equivalent for createReadStream for remote files.
Obviously I could use a separate command to download the file locally and upload it using createReadStream then delete the local file, but I want my code to be efficient and not rely on manually downloading temporary files, plus this is a good learning experience. I'd thus like to know the simplest way to pipe files as streams between two different servers.
Also I would like to avoid using extra dependencies if possible, as I'm writing a simple script which I'd rather not make reliant on too many npm packages. I rely on require("https") and require("fs") primarily. I'm curious if this can be achieved through a simple https.get() call.
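A minimal sketch of one way to do this with only the built-in modules: the response object passed to https.get() is itself a readable stream, so it can be piped into the request object of a second call. The function name pipeRemoteFile, the POST method and the Content-Type header are assumptions here; the real upload API may expect multipart form data or authentication headers instead.

```javascript
const http = require("http");
const https = require("https");

// Pick the client module that matches the URL's protocol.
function clientFor(url) {
  return new URL(url).protocol === "https:" ? https : http;
}

// Download sourceUrl and stream its body straight into a POST to targetUrl.
// callback(err, statusCode) fires once the target server has responded.
function pipeRemoteFile(sourceUrl, targetUrl, callback) {
  clientFor(sourceUrl).get(sourceUrl, (download) => {
    if (download.statusCode !== 200) {
      download.resume(); // discard the body so the socket is released
      callback(new Error("Download failed with status " + download.statusCode));
      return;
    }
    const upload = clientFor(targetUrl).request(
      targetUrl,
      // Assumed upload format: a raw binary body sent with POST.
      { method: "POST", headers: { "Content-Type": "application/octet-stream" } },
      (res) => {
        res.resume();
        res.on("end", () => callback(null, res.statusCode));
      }
    );
    upload.on("error", callback);
    download.on("error", callback);
    download.pipe(upload); // pipe() ends the upload request when the download ends
  }).on("error", callback);
}
```

Nothing is written to disk, and only a small buffer of the file is held in memory at any time. Note that https.get() does not follow redirects, so a 3xx response from the source ends up in the error branch of this sketch.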

Related

Is it possible to get JSON from a URL and save it as json file with a custom name inside a specified path/folder through the front end?

My issue is simple on paper - I have a React Native project and I'm trying to make a script that will run on build and fetch JSON from a URL, then save it as JSON file with a custom name inside a specified path/folder. For example, I want to access the JSON at www.example.com and save it as a JSON file in project/src/assets/locales/en.json
The problem is that when I google how to fetch and save JSON, most of the results are related to Node.js and I don't think I can use them. Is what I'm trying to do possible?
Since you've said you're doing this in a git push hook, you can use any of several things for this:
You can indeed use Node.js, by having the hook run it via node, and then using Node.js's http client (to read it) and fs module (to write it to a file).
You can use a shell script, perhaps with curl or wget (on platforms where those are available).
Since you're already writing JavaScript code for the app itself, Node.js is a reasonable choice.
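As a rough sketch of the Node.js option, the script below reads the JSON with the built-in http/https client and writes it out with fs. The function name saveJson is hypothetical, and it assumes the URL answers with a plain 200 response containing valid JSON.

```javascript
const fs = require("fs");
const http = require("http");
const https = require("https");
const path = require("path");

// Fetch JSON from url and write it to destPath, creating folders as needed.
function saveJson(url, destPath, callback) {
  const client = new URL(url).protocol === "https:" ? https : http;
  client.get(url, (res) => {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body so the socket is released
      callback(new Error("Request failed with status " + res.statusCode));
      return;
    }
    let body = "";
    res.setEncoding("utf8");
    res.on("data", (chunk) => { body += chunk; });
    res.on("end", () => {
      let data;
      try {
        data = JSON.parse(body); // fail early if the response is not valid JSON
      } catch (err) {
        callback(err);
        return;
      }
      fs.mkdirSync(path.dirname(destPath), { recursive: true });
      fs.writeFile(destPath, JSON.stringify(data, null, 2), callback);
    });
  }).on("error", callback);
}
```

For the case in the question, the call would look something like saveJson("https://www.example.com", "src/assets/locales/en.json", (err) => { if (err) throw err; }), run with node from the hook or build step.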

To stream file data directly from Dropbox to webpage

I want to stream data from my Dropbox to webpage in real time, but don't know how to do it.
It's usually a bad idea, because Dropbox can throttle the download speed or stop sharing a file when it is accessed from many locations.
You can install Dropbox to your server and sync some folder with your Dropbox:
https://www.dropbox.com/install
Streaming from your local folder is then an easier task.
But if you really need to get files from Dropbox in real time, you can use their API. They've got libraries for many languages. For example, this one is for PHP, and there is a tutorial there as well:
https://www.dropbox.com/developers-v1/core/start/php

Node.js expressjs fastest way to store and display a html page

I want to display a static HTML page with expressjs. What's the fastest way to do so? Currently I have a html file stored in the file system and I'm using res.sendfile(home + "/public/file.xml")
Is sendfile the fastest way to do it or might storing the file in some kind of database be faster?
Thanks
Storing files in a database? Well, some people try to do such evil things. But what is the point? The database engine has to read the file from disk as well, and it also has to do many other things that make loading take longer.
Now you seem to confuse some things. Sending files is the same thing as sending any other response. At some point you have to send the binary data, so you cannot avoid using res.send (or any other form of it). This is of course if we are talking about Node.js, because you might consider serving files via nginx, which is probably the best static file server there is.
One last thing: if you still want to serve files via Node.js, then you may consider caching them in memory. This of course depends on how many files you want to serve and how big they are. It is as easy as creating a global variable which holds the data and sending it whenever there is a request. That's the only real performance boost you can make with Node.js.
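A minimal sketch of the in-memory caching idea, using the plain http module rather than Express so that it has no dependencies. The function name createCachedServer is hypothetical, and the cached data lives in a closure variable instead of a true global, which works the same way for a single file.

```javascript
const fs = require("fs");
const http = require("http");

// Build a server that reads the page from disk once and serves it from memory.
function createCachedServer(filePath) {
  const cached = fs.readFileSync(filePath); // read once, when the server is created
  return http.createServer((req, res) => {
    res.writeHead(200, {
      "Content-Type": "text/html; charset=utf-8",
      "Content-Length": cached.length,
    });
    res.end(cached); // no disk access per request
  });
}
```

Something like createCachedServer(__dirname + "/public/index.html").listen(3000) would start it; the path and port are placeholders. The trade-off is that changes to the file on disk are not picked up until the process restarts.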
Express has support for static file serving: app.use(express.static(__dirname + '/public'));.

Download one file, with pieces stored on more than one server (HTTP)

I am working on a file upload system which will store individual parts of large files on more than one server. So the distribution of a 1GB file will look something like this:
Server 1: 0-128MB
Server 2: 128MB-256MB
Server 2: 256MB-384MB
... etc
The intention of this is to allow for redundancy (each part will exist on more than one server), security (no one server has access to the entire file), and cost (bandwidth expenses are distributed).
I am curious if anyone has an opinion on how I might be able to "trick" web browsers into downloading the various parts all in one link.
What I had in mind was something like:
Browser is linked to Server 1, which provides a content-size of the full file
Once 128MB is served, Server 1 will intentionally close the connection
Hopefully, the browser will try to restart the download, requesting Server 1
Server 1 provides a 3XX redirect to Server 2
Browser continues downloading from Server 2
I don't know for certain that my example works, as I haven't tested it yet. I was curious if there were other solutions someone might have?
I'd like to make the whole process as easy as possible (ideally requiring no work beyond a simple download). I don't want the users to have to use another program (ie: cat'ing the files together). I'd also like to not use a proxy server, since it would incur extra bandwidth costs.
As far as I'm aware, there is no JavaScript solution for writing a file. If there were one, that would be great.
AFAIK this is not possible by using the HTTP protocol. You can probably use a custom browser extension but it would depend on the browser. Another alternative is to create a Java applet that would download the file from different servers. The applet can accept the URLs to the different servers as parameters.
To save the generated file:
https://stackoverflow.com/a/4551467/329062
That solution stores the file in memory though, so it won't work with very large files.
You can download the partial files into a JS variable using JSONP. That will also let you get around the same-origin policy.
JavaScript's security model will only allow you to access data from the same origin where the JavaScript came from - i.e. not multiple servers.
If you are going to have the file bits on multiple servers, you will need the user to load the web page, fetch the bit and then finally stick the bits together in the correct order. If you can manage to get all your users to do this (correctly), you are a better man than I.
It's possible to do in modern browsers over standard HTTP.
You can use XHR2 with CORS to download file chunks as ArrayBuffers and then merge them using Blob constructor and use createObjectURL to send merged file to the user.
However, I suspect that browsers will store these objects in RAM, so it's probably a bad idea to use it for large files.
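A rough sketch of the chunk-merging approach, meant for the browser. It uses fetch() in place of XHR2 for brevity, which is a swap on my part; the CORS requirement is the same. The function names mergeChunks and downloadMerged and the chunk URLs passed in are hypothetical.

```javascript
// Join already-downloaded chunks (ArrayBuffers, in order) into a single Blob.
function mergeChunks(buffers, mimeType) {
  return new Blob(buffers, { type: mimeType || "application/octet-stream" });
}

// Browser-only: fetch each chunk URL in order, merge, and trigger a download.
// Every chunk server must send CORS headers that allow this page's origin.
async function downloadMerged(chunkUrls, fileName) {
  const buffers = [];
  for (const url of chunkUrls) {
    const res = await fetch(url);
    if (!res.ok) throw new Error("Chunk failed: " + url);
    buffers.push(await res.arrayBuffer());
  }
  const link = document.createElement("a");
  link.href = URL.createObjectURL(mergeChunks(buffers));
  link.download = fileName; // suggested file name for the save dialog
  link.click();
}
```

All chunks are held in memory until the Blob is built, so the concern about large files applies to this sketch as well.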

coordinating filesystem activity in nodejs

What is the best practice for coordinating access to files in node.js?
I'm trying to write an http based file uploader for very large files (tens of GB) that is resumable. I'm trying to figure out what the best approach is to handle two people trying to upload the same file at the same time... I'm also trying to think ahead to the possibility where more than one copy of the node.js http server is running behind a load balancer, which means catching duplicate uploads can't rely on just the code itself.
In python, for example, you can create a file by passing the correct flags to the open() call to force an atomic create. Not sure if the default node.js open new file is atomic.
Another option I thought of, but don't really want to pursue, is using a database with an async driver that supports atomic transactions to track this state...
In order to know if multiple users are uploading the same file, you will have to identify the files somehow. Hashing is best for this. First, hash the entire file on the client side to identify it. Tell the server the hash of the file, if there is already a file on the server with the same hash, then the file has already been uploaded or is currently being uploaded.
Since this is an http file server, you will likely want users to upload files from a browser. You can get the contents of a file in a browser using the FileReader API. Unfortunately, as of now this isn't widely supported. You might have to use something like Flash to get it to work in other browsers.
As you stream the file into memory with the file reader, you will want to break it into chunks and hash the chunks. Then send the server all of the file's hashed chunks. It's important that you break the file into chunks and hash those individual chunks instead of the contents of the entire file because otherwise the client could send one hash and upload an entire different file.
After the hashes are received and compared to other files' hashes and it turns out someone else is currently uploading the same file, the server then decides which user gets to upload which chunks of the file. The server then tells the uploading clients what chunks it wants from them, and the clients upload their corresponding chunks.
As each chunk is finished uploading, it is rehashed on the server and compared with the original array of hashes to verify that the user is uploading the correct file.
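The server-side part of this scheme could be sketched as follows with Node's built-in crypto module. SHA-256 is an assumption (no hash algorithm is named above), and the function names hashChunks and verifyChunk are hypothetical.

```javascript
const crypto = require("crypto");

// Hash one chunk. SHA-256 is an assumed choice of algorithm.
function hashOf(chunk) {
  return crypto.createHash("sha256").update(chunk).digest("hex");
}

// Split a buffer into fixed-size chunks and return one hex hash per chunk.
function hashChunks(buffer, chunkSize) {
  const hashes = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    hashes.push(hashOf(buffer.subarray(offset, offset + chunkSize)));
  }
  return hashes;
}

// Re-hash an uploaded chunk and compare it with the hash announced for that index.
function verifyChunk(chunk, index, expectedHashes) {
  return hashOf(chunk) === expectedHashes[index];
}
```

The client would send the array from its own equivalent of hashChunks up front, and the server would call verifyChunk as each chunk arrives, rejecting any chunk that does not match.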
I found this on HackerNews under a response to someone complaining about some of the same things in node.js. I'll put it here for completeness. This allows me to at least lock some file writes in node.js like I wanted to.
IsaacSchlueter:
You can open a file with O_EXCL if you pass in the open flags as a
number. (You can find them on require("constants"), and they need to
be binary-OR'ed together.) This isn't documented. It should be. It
should probably also be exposed in a cleaner way. Most of the rest of
what you describe is APIs that need to be polished and refined a bit.
The boundaries are well defined at this point, though. We probably
won't add another builtin module at this point, or dramatically expand
what any of them can do. (I don't consider seek() dramatic, it's just
tricky to get right given JavaScript's annoying Number problems.)
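For what it's worth, current Node.js versions expose the same behaviour through the "wx" string flag, so the numeric constants are no longer needed. A minimal sketch of using it to claim an upload; the function name claimFile is hypothetical.

```javascript
const fs = require("fs");

// Try to claim filePath. Returns a file descriptor if this call created the
// file, or null if the file already existed (someone else claimed it first).
function claimFile(filePath) {
  try {
    // "wx" opens for writing and fails if the path exists (O_CREAT | O_EXCL).
    return fs.openSync(filePath, "wx");
  } catch (err) {
    if (err.code === "EEXIST") return null;
    throw err; // any other failure is a real error
  }
}
```

The create-or-fail check happens in a single open call, so two uploaders cannot both succeed on the same local filesystem. With several servers behind a load balancer this only helps if they share a filesystem that honours exclusive creates, which is an assumption to verify for the storage in use.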
