Node.js: Express set the "trust proxy" for CloudFront

I have an Express backend behind AWS CloudFront. How do I properly set trust proxy for CloudFront?
app.set('trust proxy', function (ip) {
  if ( ???????????? ) return true; // trusted IPs
  else return false;
});
AWS CloudFront uses a huge number of IP addresses, and it is insecure to trust every AWS IP address, because anyone with an AWS EC2 instance has a valid AWS IP.

The Problem
As you mentioned, AWS CloudFront uses a long list of IP address ranges. It's mentioned in their documentation. You can see them via this one-liner (source; requires jq, which you can get from brew on macOS):
curl 'https://ip-ranges.amazonaws.com/ip-ranges.json' | jq -r '.prefixes[] | select(.service=="CLOUDFRONT") | .ip_prefix'
(Update: or directly from http://d7uri8nf7uskq.cloudfront.net/tools/list-cloudfront-ips as mentioned in their doc.)
Right now, April 2021, it is giving me 122 ranges.
The Solution
You can make an HTTP request for this file in Node, parse the JSON, get the list as an array of strings (cloudFrontIps), and pass that to app.set('trust proxy', ['loopback', ...cloudFrontIps]).
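As a minimal sketch of the parsing step, assuming ip-ranges.json keeps its current layout (a prefixes array with ip_prefix and service fields, and an ipv6_prefixes array with ipv6_prefix): the helper name extractCloudFrontRanges is made up for illustration, and the sample object is invented data in the same shape as the real file.

```javascript
// Hypothetical helper: pull the CloudFront CIDR blocks out of a parsed
// ip-ranges.json object. Field names follow the published file layout.
function extractCloudFrontRanges(ipRanges) {
  var v4 = (ipRanges.prefixes || [])
    .filter(function (p) { return p.service === 'CLOUDFRONT'; })
    .map(function (p) { return p.ip_prefix; });
  var v6 = (ipRanges.ipv6_prefixes || [])
    .filter(function (p) { return p.service === 'CLOUDFRONT'; })
    .map(function (p) { return p.ipv6_prefix; });
  return v4.concat(v6);
}

// Invented sample in the same shape as the real file.
var sample = {
  prefixes: [
    { ip_prefix: '203.0.113.0/24', service: 'CLOUDFRONT' },
    { ip_prefix: '198.51.100.0/24', service: 'EC2' }
  ],
  ipv6_prefixes: [
    { ipv6_prefix: '2001:db8::/32', service: 'CLOUDFRONT' }
  ]
};

var cloudFrontIps = extractCloudFrontRanges(sample);
console.log(cloudFrontIps);
```

With a real Express app, the result would then be passed along as app.set('trust proxy', ['loopback'].concat(cloudFrontIps)).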
Good news!
The good news is someone else has already done it! Check https://github.com/nhammond101/cloudfront-ip-ranges out.
Final Notes
It's obvious, but worth mentioning, that getting this list is asynchronous! So you might want to delay (e.g. await) your app start until the list is available. It's not a must, though: calling app.set after the HTTP server is up should work, though for that short duration you will be recording CloudFront's IP instead of the client's.
You might want to fetch this file and refresh the list periodically. The package suggests every 12 hours, using setTimeout.
My understanding is that calling app.set on a running server makes the new list apply to future requests immediately, without a restart. I get this impression from how X-Forwarded-For is examined on every request, and from how app.set calls the compileTrust function on its invocation. So, TL;DR: you shouldn't need to restart the server every 12 hours for this!
I looked at Express's code, and it seems that app.set overrides (rather than appends to) the list every time you call it. So if you have some IPs of your own (e.g. your VPC's CIDR for an AWS ELB), you have to add them to the list manually every time you call app.set in your setTimeout.
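The last point can be sketched as follows. This is only an illustration: applyTrustProxy and ownRanges are hypothetical names, 10.0.0.0/16 is an assumed VPC CIDR, and fakeApp stands in for a real Express app so the snippet runs without dependencies.

```javascript
// Hypothetical helper: rebuild the complete trust list on every refresh,
// because app.set replaces the previous value rather than appending to it.
function applyTrustProxy(app, cloudFrontIps, ownRanges) {
  var trusted = ownRanges.concat(cloudFrontIps);
  app.set('trust proxy', trusted);
  return trusted;
}

// Stand-in for an Express app so the sketch runs without dependencies.
var fakeApp = {
  settings: {},
  set: function (key, value) { this.settings[key] = value; return this; }
};

var ownRanges = ['loopback', '10.0.0.0/16']; // assumed VPC CIDR, adjust to yours
applyTrustProxy(fakeApp, ['203.0.113.0/24'], ownRanges);
applyTrustProxy(fakeApp, ['198.51.100.0/24'], ownRanges); // a later refresh
console.log(fakeApp.settings['trust proxy']);
```

In a real app, applyTrustProxy would be called from the timer callback once the fresh list has been fetched, so the private ranges are always part of the value handed to app.set.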

Related

Fastest redirects Javascript

I am creating a link-shortening app. When someone enters a long URL, it gives back a short URL. When a user clicks the short link, the app looks up the long URL in the DB and redirects to it.
In the meantime, I want to record the click count and the clicking user's OS.
I am currently using this code:
app.get('/:shortUrl', async (req, res) => {
  const shortUrl = await ShortUrl.findOne({short: req.params.shortUrl})
  if (shortUrl == null) return res.sendStatus(404)
  res.redirect(shortUrl.full)
})
findOne looks up the long URL in the database by its short ID. I use MongoDB here.
My questions are :
Are there multiple redirect methods in JS?
Does this method work under high load?
Any other methods I can use to achieve the same result?
What other factors affect redirect time?
What is 'No Redirection Tracking'?
This is a really long question, Thanks to those who invested their time in this.
Your code is OK; the only limits are where you run it and MongoDB.
I have built analytics-tracking apps that handle a billion rows per day.
I suggest you run your Node code on AWS Elastic Beanstalk. It has low latency and scales with your needs.
You should also put Redis between your requests and MongoDB, so that you query MongoDB only when the data is not yet in Redis. MongoDB has tighter read limits than a plain Redis instance.
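The "cache first, database on a miss" idea can be sketched like this. To keep the snippet runnable without a Redis or MongoDB instance, a Map stands in for Redis and findInDb is a hypothetical async function wrapping the MongoDB query; createCachedLookup is likewise an illustrative name, not an existing API.

```javascript
// Cache-aside lookup: check the cache first and fall back to the database
// only on a miss. A Map stands in for Redis here; findInDb is a
// hypothetical async function wrapping the MongoDB query.
function createCachedLookup(findInDb) {
  var cache = new Map();
  return async function lookup(short) {
    if (cache.has(short)) return cache.get(short);
    var full = await findInDb(short);
    if (full != null) cache.set(short, full); // misses are not cached
    return full;
  };
}

// Fake database that counts how often it is queried.
var dbCalls = 0;
var fakeDb = { abc: 'https://example.com/a/very/long/url' };
var lookup = createCachedLookup(async function (short) {
  dbCalls += 1;
  return fakeDb[short] || null;
});

// Two lookups of the same key should hit the database only once.
var done = lookup('abc')
  .then(function () { return lookup('abc'); })
  .then(function (full) {
    console.log(full, 'db calls:', dbCalls);
    return { full: full, dbCalls: dbCalls };
  });
```

With a real Redis client, the Map calls would become asynchronous get/set calls, usually with an expiry so stale entries eventually drop out.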
Are there multiple redirect methods in JS?
First off, there are no redirect methods in Javascript. res.redirect() is a feature of the Express http framework that runs in nodejs. This is the only method built into Express, though all a redirect response consists of is a 3xx (often 302) http response status and setting the Location header to the redirect location. You can manually code that just as well as you can use res.redirect() in Express.
You can look at the res.redirect() code in Express here.
The main things it does are set the location header with this:
this.location(address)
And set the http status (which defaults to 302) with this:
this.statusCode = status;
Then, the rest of the code has to do with handling variable arguments, handling an older design for the API and sending a body in either plain text or html (neither of which is required).
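To illustrate that a redirect is nothing more than a status code plus a Location header, here is a small sketch that uses only Node's core response methods. sendRedirect is an illustrative name, and fakeRes is a stand-in for http.ServerResponse so the snippet runs on its own.

```javascript
// Manual redirect using only Node's core response API: a 3xx status
// plus a Location header. No body is required.
function sendRedirect(res, url, status) {
  res.statusCode = status || 302;
  res.setHeader('Location', url);
  res.end();
}

// Minimal stand-in for http.ServerResponse so the sketch runs on its own.
var fakeRes = {
  statusCode: 200,
  headers: {},
  ended: false,
  setHeader: function (name, value) { this.headers[name] = value; },
  end: function () { this.ended = true; }
};

sendRedirect(fakeRes, 'https://example.com/long-url');
console.log(fakeRes.statusCode, fakeRes.headers.Location);
```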
Does this method work under high load?
res.redirect() works just fine at a high load. The bottleneck in your code is probably this line of code:
const shortUrl = await ShortUrl.findOne({short: req.params.shortUrl})
And how high a scale that goes to depends on a whole bunch of things about your database, configuration, hardware, setup, etc. You should probably just test how many requests/sec of this kind your current database can handle.
Any other methods I can use to achieve the same result?
Sure there are. But, you will have to use some data store to look up the shortUrl to find the long url and you will have to create a 302 response somehow. As said earlier, the scale you can achieve will depend entirely upon your database.
What other factors affect redirect time?
This is pretty much covered above (hint: it's all about the database).
What is 'No Redirection Tracking'?
You can read about it here on MDN.

Need some clarification on nodejs concepts

I am starting to learn more about how this "web world" works and that's why I am taking the free code camp course. I already took front-end development and I really enjoyed it. Now I am on the back end part.
The back end is much more foggy for me. There are many things that I don't get so I would hope that someone could help me out.
First of all I learned about the get method. so I did:
var http = require('http');
and then made a get request:
http.get(url, function callBack(response){
  response.setEncoding("utf8");
  response.on("data", function(data){
    console.log(data);
  });
});
Question 1)
So apparently this code "gets" a response from a certain URL, but what response? I didn't even ask for anything in particular.
Moving on...
The second exercise asks us to listen to a TCP connection and create a server and then write the date and time of that connection. So here's the answer:
var server = net.createServer(function listener (socket){
  socket.end(date);
});
server.listen(port);
Question 2)
Okay so I created a TCP server with net.createServer() and when the connection was successful I outputted the date. But where? What did actually happen when I put date inside of socket.end()?
Last but not least...
In the last exercise I was told to create an HTTP server (what?) to serve a text file every time it receives a request, and here's what I did:
var server = http.createServer(function callback(request, response){
  var read = fs.createReadStream(location);
  read.pipe(response);
});
server.listen(port);
Question 3)
a) Why did I have to create an HTTP server instead of a regular TCP one? What's the difference?
b) What does createReadStream do?
c) What does pipe() do?
If someone could help me, trying to make the explanation easier would help me a lot since I am, as you can see, pretty dumb on this subject.
Thank you a lot!
This is a little broad for Stack Overflow, which favors focused questions that address specific problems. But I feel your pain, so…
Question 1:
Http.get is roughly equivalent to requesting a webpage. The url in the function is the page you are requesting. The response will include several things like the HTTP response code, but also (most importantly) the content of the page, which is what you are probably after. On the backend this is normally used for hitting APIs that get data rather than actual web pages, but the transport mechanism is the same.
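To make "the response" concrete, here is a small sketch that collects the status code and the full body of a GET request. fetchBody is an illustrative helper name, and the URL in the usage comment is only an example.

```javascript
var http = require('http');

// Collect the whole response body of a GET request and hand it to the
// callback together with the HTTP status code.
function fetchBody(url, callback) {
  http.get(url, function (response) {
    var body = '';
    response.setEncoding('utf8');
    response.on('data', function (chunk) { body += chunk; });
    response.on('end', function () {
      callback(null, response.statusCode, body);
    });
  }).on('error', callback);
}

// Usage (the URL is only an example):
// fetchBody('http://example.com/', function (err, status, body) {
//   if (err) throw err;
//   console.log(status, body.length);
// });
```

The 'data' event can fire several times for one response, which is why the chunks are concatenated and only handed over on 'end'.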
Question 2:
When you create a server, you are waiting for someone else to request a connection (the way you do when you use http.get()). When you output data, you are sending them a response like the one you received in question 1. socket.end(date) writes the date to the connecting client and then closes the connection.
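A self-contained version of the exercise might look like the sketch below. createDateServer is an illustrative name, the ISO date format is an arbitrary choice, and port 8000 in the usage comment is only an example.

```javascript
var net = require('net');

// Each incoming TCP connection receives the current date as text, then the
// server closes its side of the connection.
function createDateServer() {
  return net.createServer(function listener(socket) {
    socket.end(new Date().toISOString() + '\n');
  });
}

// Usage (port 8000 is an arbitrary choice):
// createDateServer().listen(8000);
// then, from another terminal: nc localhost 8000
```

Whatever connects to the port, whether nc, telnet, or another Node program, is the place where the date ends up.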
Question 3:
HTTP is a higher-level protocol than TCP. This basically means it is more specific and TCP is more general (pedants will take issue with that statement, but it's an easy way to understand it). HTTP defines things like GET and POST that you use when you download a webpage. Lower down in the protocol stack, HTTP uses TCP. You could just use TCP, but you would have to do a lot more work to interpret the requests that come in. The HTTP library does that work for you. Other protocols like FTP also use TCP, but they are different protocols from HTTP.
For this answer, you need to understand two things. An IP address is the numeric address of the server that hosts a website. A domain name maps a human-readable name to that IP address, so instead of typing numbers like 192.168.1.1 we can just type names (www.hotdog.com). That's what your get request is doing: it requests the page at that address.
socket.end is a method you're calling. According to the nodejs.org docs, socket.end "Half-closes the socket. i.e., it sends a FIN packet. It is possible the server will still send some data". If you pass it data, that data is written to the socket first, so socket.end(date) sends today's date to the connected client and then half-closes the socket.
HTTP is the Hypertext Transfer Protocol; TCP (Transmission Control Protocol) provides the underlying connection between two computers.
3a HTTP is the protocol browsers speak, which is why the exercise used it: you were serving a page, hosted locally, that a browser or any other HTTP client can request.
3b fs.createReadStream() returns a new ReadStream object (see Readable Stream).
Be aware that, unlike the default value set for highWaterMark on a readable stream (16 KiB), the stream returned by this method has a default value of 64 KiB for the same parameter.
3c pipe:
The 'pipe' event is emitted when the stream.pipe() method is called on a readable stream, adding this writable to its set of destinations. In your code, read.pipe(response) attaches the HTTP response as the destination of the file stream, so each chunk read from the file is written to the response, and the response is ended once the file has been read completely.
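To see createReadStream and pipe() working together outside of an HTTP server, here is a small sketch that copies one file to another. copyFile is an illustrative helper name, and the file names in the usage comment are placeholders.

```javascript
var fs = require('fs');

// Copy a file by piping a readable stream into a writable stream.
// pipe() forwards each chunk as it is read and ends the destination
// once the source has no more data.
function copyFile(source, destination, callback) {
  var read = fs.createReadStream(source);
  var write = fs.createWriteStream(destination);
  read.on('error', callback);
  write.on('error', callback);
  write.on('finish', function () { callback(null); });
  read.pipe(write);
}

// Usage (file names are placeholders):
// copyFile('input.txt', 'copy.txt', function (err) {
//   console.log(err ? 'copy failed' : 'copy done');
// });
```

Replacing the write stream with an HTTP response object gives the file-serving server from the exercise, because a response is a writable stream too.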

Modifying Pubnub presence heartbeat for Python

According to the presence documentation, Pubnub will fire the Timeout presence event after 5 minutes of not receiving a heartbeat.
I need to modify this value but I cannot find a way of doing this with the Python SDK. Here is a link to the Pubnub docs showing how to do it with JavaScript: http://www.pubnub.com/docs/web-javascript/presence#optimizing_timeout_events
Does anyone know how to achieve this using the python SDK?
Thanks a lot.
edit: My clients are not javascript clients. They are python console applications.
Heartbeat can be monkey-patched into the Pubnub class with something like this:
from pubnub import Pubnub
class PubnubHeartbeat(Pubnub):
    def __init__(self, heartbeat=300, **kwargs):
        self.heartbeat = heartbeat
        super(PubnubHeartbeat, self).__init__(**kwargs)
    def getUrl(self, request):
        # Inject the heartbeat parameter into subscribe requests only
        if "subscribe" in request['urlcomponents'][:2]:
            if "urlparams" not in request:
                request['urlparams'] = {}
            request['urlparams']['heartbeat'] = self.heartbeat
        return super(PubnubHeartbeat, self).getUrl(request)
p = PubnubHeartbeat(
    subscribe_key="demo",
    publish_key="demo",
    heartbeat=60
)
def recv(msg):
    print(msg)
p.subscribe(channels="heartbeat_test", callback=recv)
This isn't recommended for long-term production code (unless maybe if you are pinning your Pubnub dependency with pubnub==3.7.3 during install). The example subclass uses an undocumented method to inject the heartbeat URL parameter. (See Craig Conover's answer for a description of what that does).
PubNub Python SDK Presence
Because Python is rarely used as a client, the PubNub Python SDK's presence API is not as robustly implemented as in the traditional client SDKs (JavaScript, etc.). There is no heartbeat parameter in the Pubnub initializer, nor is there a setter or attribute for it, so you are forced to stick with the default 5-minute heartbeat setting.
However, with the PubNub JavaScript SDK, when you init PUBNUB with a custom heartbeat (60 seconds for example), the heartbeat key/value is just passed along as a query param in the REST URL:
http://pubsub.pubnub.com/subscribe/demo/my_channel/0/14411482999795083?uuid=12345&pnsdk=PubNub-JS-Web%2F3.7.14&heartbeat=60
So if you really wanted to, you could just subscribe using REST calls and pass the heartbeat in that way.
What I forgot to mention when I first posted this answer is that your client is responsible for pinging the PubNub server at least once every 60 seconds, preferably on a 30-second interval, given the 60-second heartbeat window that the server is configured with for this client.
With the PubNub SDK, this is done in a separate thread over the same connection (sort of - at least in a way that the server knows that it is the same client that set the heartbeat).
That said, we are getting into a less trivial solution using REST and so why even use the SDK. It would be easier for us to update the Python SDK than for you to do all the dirty work. We will do just that but not in the short term but hopefully with the next minor release of the Python SDK.
Based on our off-SO conversation, you just want to shorten the window of time that a client will appear to be online when in fact the client is not connected and was unable to explicitly unsubscribe before the connection was closed (closed the terminal instead of "logging off" using your app's UI or command line).
What you can do is implement a ping/ack handshake protocol. This is very high level so there may be some finer points that need to be filled in but it should provide the general concept.
Before one client (sender) engages in communication with another (receiver), just send a ping message to the other client on the client’s private channel (every client will subscribe to a channel unique to that client: for example, private_client001, private_client002, etc.).
The receiving client will auto-ack back on the sender’s unique channel (which will be part of the ping msg payload)
If the sender of the ping doesn’t get an ack msg back within a second (or whatever time tolerance works for you) then assume the receiver is not online.
When the receiver comes back online, you get missed messages, and any pings that are less than 5 minutes old, you can ack back and see if the sender still wants to engage.
This is a common issue for many use cases (especially chat) because there is always that window of time (the heartbeat window) that a client could really be offline but appear to be online because they did not leave in proper, predictable fashion that would have produced an explicit unsubscribe resulting in a leave event. So implementing this sort of handshake pre-connect protocol is a good practice.

Using the Tor api to make an anonymous proxy server

I am making an app which makes lots of api calls to some site. The trouble I've run into is that the site has a limit on the number of api calls that can be made per minute. To get around this I was hoping to use Tor in conjunction with node-http-proxy to create a proxy table which uses anonymous ip addresses taken from the tor api.
So my question is, how possible is this, and what tools would you recommend for getting it done. My app is written in javascript, so solutions involving things like node-tor are preferable.
I've found a reasonable solution using tor and curl command line tools via Node.js.
Download the tor command-line tool and set it in your $PATH.
Now, we can send requests through this local tor proxy which will establish a "circuit" through the TOR network. Let's see our IP address using http://ifconfig.me. You can copy paste all of these things into your Node REPL:
var cp = require('child_process'),
    exec = cp.exec,
    spawn = cp.spawn,
    tor = spawn('tor'),
    puts = function (err, stdo, stde) { console.log(stdo); },
    child;
After this, you may want to build in a delay while the tor proxy is spawned & sets itself up.
Next, let's go through the TOR network and ask http://ifconfig.me what IP address is accessing it.
function sayIP() {
  child = exec('curl --proxy socks5h://localhost:9050 http://ifconfig.me', puts);
}
sayIP();
If you want a new IP address, restarting tor by turning it off and then on seems to be the most reliable method:
function restartTor() {
  tor.kill('SIGINT');
  tor = spawn('tor');
}
restartTor();
Note: There is another way I've seen people describe getting a new IP address (setting up a new "circuit") on the fly, but it only seems to work about 10% of the time in my tests. If you want to try it:
Find & copy torrc.sample to torrc, then change torrc as follows:
Uncomment ControlPort 9051 (9050 is the local proxy, opening 9051 lets us control it)
Uncomment & set CookieAuthentication 0.
Uncomment HashedControlPassword and set to result of:
$ tor --hash-password "your_password"
Then you could use a function like this to send a NEWNYM signal to your local tor proxy to try getting a new IP address without restarting.
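function newIP() {
  // The backslashes before the quotes are doubled so that the shell, not JavaScript, sees \" around the password.
  var signal = 'echo -e "AUTHENTICATE \\"your_password\\"\r\nsignal NEWNYM\r\nQUIT" | nc -v 127.0.0.1 9051';
  child = exec(signal, puts);
}
newIP();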
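As an alternative sketch that avoids shell quoting altogether, the same control-port conversation can be sent with Node's net module. This assumes tor is running locally with ControlPort 9051 and password authentication, as configured above; buildNewnymCommands and requestNewCircuit are illustrative names, not part of any tor library.

```javascript
var net = require('net');

// Build the control-port conversation: authenticate, ask for a new
// circuit, then close the session. Lines end with CRLF.
function buildNewnymCommands(password) {
  // Escape backslashes and double quotes inside the quoted password.
  var quoted = password.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return 'AUTHENTICATE "' + quoted + '"\r\n' +
         'SIGNAL NEWNYM\r\n' +
         'QUIT\r\n';
}

// Send the commands to the local control port and collect tor's replies.
function requestNewCircuit(password, callback) {
  var reply = '';
  var socket = net.connect(9051, '127.0.0.1', function () {
    socket.write(buildNewnymCommands(password));
  });
  socket.setEncoding('utf8');
  socket.on('data', function (chunk) { reply += chunk; });
  socket.on('end', function () { callback(null, reply); });
  socket.on('error', callback);
}

// Usage, once tor is running with ControlPort 9051 enabled:
// requestNewCircuit('your_password', function (err, reply) {
//   console.log(err || reply); // tor answers accepted commands with 250 lines
// });
```

Reading the reply makes it possible to tell a rejected password apart from an accepted signal, which the echo and nc pipeline does not report.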

Trying to setup a node.js web server

I am new to web servers and node.js and I need some help.
I have no idea what to put in the .listen() call.
I think that since I want it to be reachable from the internet, the server needs to listen on port 80, but I don't know what to put as the second value.
.listen(80, "What do I add here?");
Also, I have a free domain name (www.example.co.cc) that points to a dynamic DNS service (DnsExit), since I have a dynamic IP. I installed the program needed to update my IP address.
Is there anything I am missing?
Have you seen the example on the homepage of the Node.js project?
http://nodejs.org/
It clearly demonstrates .listen(1337, "127.0.0.1"); and the next line reads Server running at http://127.0.0.1:1337/, so the second argument is the IP you want to listen on. If you then take a look at the documentation, you will see that this second argument is optional; if you omit it, Node.js will accept incoming connections directed at any IPv4 address (in current Node.js versions, at any IPv6 address as well when IPv6 is available).
http://nodejs.org/docs/v0.5.6/api/http.html#server.listen
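The two forms of listen can be sketched side by side as follows. The start helper is an illustrative name; the server itself mirrors the usual hello-world example.

```javascript
var http = require('http');

var server = http.createServer(function (request, response) {
  response.writeHead(200, { 'Content-Type': 'text/plain' });
  response.end('Hello World\n');
});

// Start listening. Omit `host` to accept connections on every interface,
// or pass '127.0.0.1' to accept local connections only.
function start(port, host, onReady) {
  if (host) {
    server.listen(port, host, onReady);
  } else {
    server.listen(port, onReady);
  }
}

// Usage: start(80, null, function () { console.log('listening'); });
// Ports below 1024 usually need elevated privileges on Unix-like systems.
```

For a server that should be reachable from the internet through a dynamic DNS name, leaving the host out is usually what you want, since the machine's public address may change.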
