I have been searching for a way to handle large-scale HTTP traffic with Express.js. I know Node.js has had 'workers' since version 13 for multithreading, but in terms of handling multiple HTTP requests to a single endpoint, how would I go about doing that and scaling up?
For example, if 10,000 requests come in at the same time, would another thread open up to deal with the extra requests and speed up the process? Does Node do this automatically, or do I need to configure something?
Thanks!
Express.js is designed around the event-loop philosophy: each request-response cycle is expected to take negligible total CPU time (ignoring any waiting due to I/O). This allows an Express.js server to respond to many requests efficiently on a single thread. To make this work, any request that demands more than negligible CPU time is expected to create a Worker thread, or spawn a process, to do the computation away from the event-loop thread. That responsibility falls to the programmer.
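For example, here is a minimal sketch of that pattern; worker.js and the heavyComputation() function inside it are hypothetical placeholders:

// server.js: the route delegates CPU-heavy work to a Worker thread
const express = require('express');
const { Worker } = require('worker_threads');

const app = express();

app.get('/compute', (req, res) => {
  // Spawning a Worker keeps the event-loop thread free to accept new requests
  const worker = new Worker('./worker.js', { workerData: req.query.input });
  worker.once('message', (result) => res.json({ result }));
  worker.once('error', (err) => res.status(500).send(err.message));
});

app.listen(3000);

// worker.js: runs on its own thread
// const { parentPort, workerData } = require('worker_threads');
// parentPort.postMessage(heavyComputation(workerData)); // heavyComputation is a stand-in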
If your Express.js app still cannot keep up with the request volume even though this philosophy is followed, the solution is to scale out above the process level using a load balancer, whether at the web-server level (e.g. Apache httpd's mod_proxy_balancer) or at the DNS level.
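Before (or in addition to) an external load balancer, a single machine can also be spread across its CPU cores with Node's built-in cluster module. A minimal sketch:

const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) { // cluster.isMaster on Node versions before 16
  // Fork one worker process per CPU core
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  const express = require('express');
  const app = express();
  app.get('/', (req, res) => res.send(`handled by pid ${process.pid}`));
  app.listen(3000); // workers share the port; connections are distributed among them
}

Each forked process is a full copy of the app, so per-process memory use multiplies accordingly.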
Related
I'm trying to understand node.js's single-threaded architecture and the event loop to make our application more efficient. So consider this scenario where I have to make several database calls for an HTTP API call. I can do it using Promise.all(), or using separate awaits.
example:
Using async/await
await insertToTable1();
await insertToTable2();
await updateTable3();
Using Promise.all() I can do the same by
await Promise.all([insertToTable1(), insertToTable2(), updateTable3()]);
Here, for one API hit at a given time, Promise.all() will return the response quicker, as it fires the DB calls in parallel. But if I have 1000 API hits per second, will there be any difference? For this scenario, is Promise.all() better for the event loop?
Update
Assume the following,
By 1000 API hits, I meant the overall traffic to the application. Consider that there are 20-25 APIs. Out of these, a few might do DB operations, a few might make some HTTP calls, etc. Also, at no point will we reach the DB pool's max connections.
Thanks in advance!!
As usual when it comes to system design, the answer is: it depends.
There are a lot of factors that determine the performance of either. In general, awaiting a single Promise.all() waits for all requests in parallel.
Event Loop
The event loop uses exactly 0% CPU time to wait for a request. See my answer to this related question for an explanation of how exactly the event loop works: Performance of NodeJS with large amount of callbacks
So from the event loop's point of view there is no real difference between requesting sequentially and requesting in parallel with Promise.all(). If this is the core of your question, I guess the answer is: there is no difference between the two.
However, processing the callbacks does take CPU time. Again, the time to finish executing all the callbacks is the same. So from the point of view of CPU performance, again, there is no difference between the two.
Making requests in parallel does reduce overall execution time, however. Firstly, if the service is multithreaded, you are essentially exploiting its multithreadedness by making parallel requests. This is what makes node.js fast even though it's single-threaded.
Even if the service you are requesting from isn't multithreaded and actually handles requests sequentially, or if the server you're requesting from runs on a single-core CPU (rare these days, but you can still rent single-core virtual machines), parallel requests still reduce networking overhead, since your OS can send multiple requests in a single Ethernet frame, amortizing the overhead of packet headers over several requests. This has diminishing returns beyond around half a dozen parallel requests, however.
One Thousand Requests
You've hypothesized making 1000 requests. Whether or not awaiting 1000 promises in parallel actually causes 1000 parallel requests depends on how the API works at the network level.
Connection pools
Lots of database libraries implement connection pools. That is, the library will open some number of connections to the database, for example 5, and reuse the connections.
In some implementations, making 1000 requests via such a library will cause the low-level networking code of the library to batch them 5 at a time. This means that at most you can have 5 parallel requests (assuming a pool of 5 connections). In this case it is perfectly safe to make 1000 parallel requests.
Some implementations, however, have a growable connection pool. In such implementations, making 1000 parallel requests will cause your software to open 1000 sockets to the remote resource. In such cases, how safe it is to make 1000 parallel requests depends on whether the remote server allows this.
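As an illustration, here is what a fixed pool looks like with the mysql2 library (the connection details, table and rows are hypothetical). All 1000 promises can be created at once, but the pool only runs 5 queries at a time:

const mysql = require('mysql2/promise');

const pool = mysql.createPool({
  host: 'localhost',
  user: 'app',
  database: 'mydb',    // hypothetical connection details
  connectionLimit: 5,  // at most 5 queries in flight at any moment
});

async function insertAll(rows) {
  // Safe even for 1000 rows: excess queries wait inside the pool's queue
  return Promise.all(rows.map((row) => pool.query('INSERT INTO t SET ?', row)));
}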
Connection limit
Most databases, such as MySQL and PostgreSQL, allow the admin to configure a connection limit, for example 5, so that the database rejects connections beyond that limit. If you use a library that does not automatically manage the maximum number of connections to your database, the database will accept the first 5 requests and reject the remaining ones until another slot is available (it's possible that a connection is freed before node.js finishes opening the 1000th socket). In this case you cannot successfully make 1000 parallel requests; you need to manage how many parallel requests you make.
Some API services also limit the number of requests you can make in parallel. Google Maps, for example, limits you to 500 requests per second. Awaiting 1000 parallel requests would therefore cause 50% of your requests to fail, and could get your API key or IP address banned.
Networking limits
There is a theoretical limit on the number of sockets your machine or a server can open, but it is extremely high, so it's not worth discussing here.
However, every OS currently in existence limits the maximum number of open sockets. On Linux (e.g. Ubuntu and Android) and Unix (e.g. macOS and iOS), sockets are implemented as file descriptors, and there is a maximum number of file descriptors allocated per process.
On Linux this number usually defaults to 1024 files. Note that a process opens 3 file descriptors by default: stdin, stdout and stderr. That leaves 1021 file descriptors shared between files and sockets. So your 1000 parallel requests skirt very close to this number, and may fail if two clients each trigger 1000 parallel requests at the same time.
This number can be increased, but it does have a hard limit. The current maximum number of file descriptors you can configure on Linux is 590432. However, this extreme configuration only works properly on a single-user system with no daemons (or other background programs) running.
What to do?
The first rule when writing networking code is: try not to break the network. Be reasonable about the number of requests you make at any one time. You can batch your requests to the limit of what the service expects.
With async/await it's easy. You can do something like this:
// `one_thousand_requests` is assumed to be an array of functions,
// each of which starts a request and returns a promise when called.
let parallel_requests = 10;

while (one_thousand_requests.length > 0) {
    // Start up to `parallel_requests` requests...
    let batch = [];
    for (let i = 0; i < parallel_requests; i++) {
        let req = one_thousand_requests.shift(); // undefined once the array is empty
        if (req) {
            batch.push(req());
        }
    }
    // ...then wait for the whole batch before starting the next one.
    await Promise.all(batch);
}
Generally, the more requests you can make in parallel, the better (shorter) the overall processing time will be. I guess this is what you wanted to hear. But you need to balance parallelism against the factors above. 5 is generally OK. 10, maybe. 100 will depend on the server responding to the requests. At 1000 or more, the admin who installed the server will probably have to tune their OS.
The await approach suspends function execution at every await call and runs the calls sequentially, while Promise.all() starts them all at once (asynchronously, in parallel) and resolves once every one of them succeeds.
So it's better to use Promise.all() if your three methods (insertToTable1(), insertToTable2(), updateTable3()) are independent.
JavaScript's ability to execute other work while a heavy operation is in progress is achieved through the event loop and the call stack.
Event Loops
The decoupling of the caller from the response allows the JavaScript runtime to do other things while waiting for your asynchronous operations to complete and their callbacks to fire.
JavaScript runtimes contain a message queue which stores a list of messages to be processed and their associated callback functions. These messages are queued in response to external events (such as a mouse being clicked or receiving the response to an HTTP request) given a callback function has been provided.
The Event Loop has one simple job: to monitor the Call Stack and the Callback Queue. If the Call Stack is empty, it takes the first event from the queue and pushes it onto the Call Stack, which effectively runs it.
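A tiny script makes the ordering concrete; the synchronous code on the Call Stack always finishes before anything from the queues runs:

console.log('start');

setTimeout(() => console.log('timeout callback'), 0); // goes to the callback queue

Promise.resolve().then(() => console.log('promise callback')); // microtask queue

console.log('end');

// Prints: start, end, promise callback, timeout callback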
My node.js app currently subscribes to a number of websocket servers that are starting to push a lot of data over to the app every second. This websocket client app has several event handlers that do some work upon receiving the websocket data.
However, Node.js appears to be using only 1 CPU core at any one time, leaving the remaining cores underutilized. This is expected, as Node.js uses a single-threaded event loop model.
Is it possible to load balance the incoming websocket data handling over multiple CPU cores? I understand that Node Cluster and pm2 Cluster Mode are able to load balance if you are running websocket servers, but how about websocket clients?
From the client side, I can think of the following options:
Create some child processes (likely one for each CPU core you have) and then divide your webSocket connections among those child processes.
Create node.js WorkerThreads (likely one for each CPU core you have) and then divide your webSocket connections among those WorkerThreads.
Create node.js WorkerThreads (likely one for each CPU core you have) and create a work queue where each incoming piece of data from the various webSocket connections is put into the work queue. Then the WorkerThreads are regularly dispatched data from the queue to work on. As a thread finishes a piece of data, it is given the next piece from the queue, and so on (a sketch of this option follows below).
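A minimal sketch of that third option (handler.js is a hypothetical worker script that processes one item and posts a message back when done):

const { Worker } = require('worker_threads');
const os = require('os');

const queue = []; // incoming webSocket payloads wait here
const idle = [];  // workers that are currently free

for (let i = 0; i < os.cpus().length; i++) {
  const worker = new Worker('./handler.js');
  worker.on('message', () => { // worker signals that it finished its item
    idle.push(worker);
    drain();
  });
  idle.push(worker);
}

// Call this from each webSocket's 'message' event handler
function enqueue(data) {
  queue.push(data);
  drain();
}

function drain() {
  while (queue.length > 0 && idle.length > 0) {
    idle.pop().postMessage(queue.shift()); // hand the next item to a free worker
  }
}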
How to best solve this issue really depends upon where the CPU time is mostly being spent. If it is the processing of the incoming data that is taking the most time, then any of these solutions will help apply multiple CPUs to that task. If it is the actual receiving of the incoming data, then you may need to move the incoming webSockets themselves to a new thread/process as in the first two options.
If it's a system bandwidth issue due to the volume of data, then you may need to increase the bandwidth of your network connection or involve multiple network adapters.
I have a few questions about the combination of Nginx and Nodejs.
I've used Nodejs to create my server, and now I'm facing an issue with the server blocking on certain actions (writing, removing, etc.).
We are using Redis to lock the server while a request is being handled; for example, if a new user is signing up, all the other requests wait until the process is done, and if there is another, longer process running, the other requests wait even longer.
We thought about creating a load balancer (using Nginx) that will check whether the server is locked; if it is, the balancer will open a new task instead of waiting until the first process is done.
I used this tutorial and created a dummy server, then struggled with the idea of implementing this functionality of opening new ports.
I'm new to load balancing implementations and would be happy to hear your thoughts and get your help.
Thank you.
The gist of it is that your server needs to not crash if more than one connection attempt is made to it. Even if you use NGINX as a load balancer and have five different instances of your server running... what happens when six clients try to access your app at once?
I think you are thinking about load balancers slightly wrong. There are different load balancing methods, but the simplest one to think about is "round robin" in which each connection gets forwarded to the next server in the list (the rest are just more robust and complicated versions of this one). When there are no more servers to forward to, the next connection gets forwarded to the first server again (whether or not it is done with its last connection) and the circle starts over. Thus, load balancers aren't supposed to manage "unique connections" from clients...they are supposed to distribute connections among servers.
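As a toy illustration, round robin is nothing more than a cursor that wraps around a list of servers (the addresses here are made up):

const servers = ['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000'];
let next = 0;

function pickServer() {
  const server = servers[next];
  next = (next + 1) % servers.length; // wrap back to the first server
  return server; // the balancer forwards the connection here, busy or not
}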
Your server doesn't necessarily need to accept connections and handle them all at once. But it needs to at least allow connections to queue up without crashing, and then accept and deal with them one by one.
You can go the route you are discussing. That is, you can fire up a unique instance of your server (via Heroku or elsewhere) for every single connection that is made to your app. But this is not efficient, and it will ultimately create more work for you in trying to architect a system that can do that well. Why not just fix your server?
I am wondering whether node.js is a good fit for a server-side application that is not actually communicating with the browser, or where browser communication is just an additional part of the whole app, used mainly for management.
The idea is simple:
The server receives a high volume of UDP traffic with short messages containing user data from another server.
For each message, the app performs a DB lookup and filters out messages whose userids are not on the whitelist.
Filtered messages are processed, which results in another DB update or in sending data to another server.
Is such a case a good scenario for learning node.js, or is there no benefit from it compared to e.g. Java EE?
Disclaimer: I work for a company that contributes to node.js and promotes its usage, so my opinion might be biased.
As others mentioned in the comments, node.js should be a good fit for your scenario. It is actually one of the most common scenarios where people use node.js: fetch data from (possibly multiple) sources, do a small amount of CPU-light processing, and send back the response or store the result. Unless the message filtering is very CPU-expensive, the node.js implementation will probably outperform the J2EE version.
The reason is that Node.js is heavily optimised for solutions where the server spends most of the time waiting. Waiting for client connection, waiting for database response, waiting for disc read/write, waiting for client to read the response, etc.
J2EE employs multithreading, where one thread handles each request, which is suboptimal in this case. Most threads are waiting, so you are not getting the benefit of running lots of code in parallel, but you still have to pay the price of context switching and higher memory usage.
There is one thing I would consider before going for node.js: are you able and allowed to deploy node.js into your production environment? Moving to a new platform has some associated costs, people operating your application will have to learn how to deal with node.js applications.
I want to handle a lot of (> 100k/sec) POST requests from javascript clients with some kind of service server. Not much of this data will be stored, but I have to process all of it, so I cannot spend all my server power on serving requests alone. All the processing needs to be done in the same server instance, otherwise I'll need to use a database for synchronization between servers, which would be slower by orders of magnitude.
However, I don't need to send any data back to the clients, and they don't even expect it.
So far my plan has been to create a few proxy server instances that can buffer the requests and send them to the main server in bigger packs.
For example, let's say that I need to handle 200k requests/sec and each server can handle 40k. I can split the load between 5 of them. Then each one will buffer requests and send them to the main server in packs of 100. This will result in 2k requests/sec on the main server (however, each message will be 100 times bigger, which probably means around 100-200 kB). I could even send them to the main server using UDP to decrease the amount of resources needed (then I need only one socket on the main server, right?).
I'm just wondering if there is another way to speed things up, especially since, as I said, I don't need to send anything back. I also have full control over the javascript clients, but unluckily javascript is unable to send data using UDP, which would probably be the solution for me (I don't even care if 0.1% of the data is lost).
Any ideas?
Edit in response to the answers given so far.
The problem isn't with the server being too slow at processing events from the queue, or with putting events into the queue itself. In fact, I plan to use the disruptor pattern (http://code.google.com/p/disruptor/), which has been proven to process up to 6 million requests per second.
The only problem I could potentially have is the need to keep 100, 200 or 300k sockets open at the same time, which none of the mainstream servers can handle. I know some custom solutions are possible (http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3), but I'm wondering if there is a way to make even better use of the fact that I don't have to reply to the clients.
(For example, some way to embed part of the data in the initial TCP packet and handle TCP packets as if they were UDP. Or some other kind of magic ;))
Make a unique and fast function (probably in C) that receives all the requests, behind a very fast server (like nginx). The only job of this function is to store the requests in a very fast queue (like redis, if you have enough RAM).
In another process (or on another server), pop items off the queue and do the real work, processing the requests one by one.
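As a node.js illustration of the same split, using the ioredis client (the queue name and processRequest() are placeholders):

const Redis = require('ioredis');

// Front server: its only job is to push the raw request into the queue
const producer = new Redis();
async function enqueue(body) {
  await producer.lpush('requests', body);
}

// Separate worker process: pops and processes items one by one
async function workerLoop() {
  const consumer = new Redis();
  for (;;) {
    const [, body] = await consumer.brpop('requests', 0); // blocks until an item arrives
    await processRequest(body); // stand-in for the real work
  }
}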
If you have control of the clients, as you say, then your proxy server doesn't even need to be an HTTP server, because you can assume that all of the requests are valid.
You could implement it as a non-HTTP server that simply sends back a 200, reads the client request until it disconnects, and then queues the requests for processing.
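A rough sketch of that with node's net module, replying with a canned 200 and queueing the raw bytes without ever parsing HTTP:

const net = require('net');

const queue = []; // drained by a separate processing step

const server = net.createServer((socket) => {
  // Answer up front; we never parse the request as HTTP
  socket.write('HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n');
  socket.on('data', (chunk) => queue.push(chunk)); // store whatever the client sends
  socket.on('error', () => socket.destroy());
});

server.listen(8080);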
I think what you're describing is an implementation of a Message Queue. You also will need something to hand off these requests to whatever queue you use (RabbitMQ is quite good, there are many alternatives).
You'll also need something else running which can do whatever processing you actually want on the requests. You haven't made that very clear, so I'm not too sure exactly what would be right for you. Essentially the idea is that incoming requests are dumped as quickly and simply as possible into the queue by your web server, and then the web server is free to go back to serving more requests. When the system has some resources, it uses them to process the queue, but when it's busy the queue just keeps growing.
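For instance, with the amqplib client for RabbitMQ the hand-off is only a few calls (the queue name and processRequest() are placeholders):

const amqp = require('amqplib');

async function main() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();
  await ch.assertQueue('incoming');

  // Web-server side: dump the request into the queue and move on
  ch.sendToQueue('incoming', Buffer.from(JSON.stringify({ userId: 1 })));

  // Worker side: consume at whatever pace resources allow
  await ch.consume('incoming', (msg) => {
    processRequest(msg.content); // stand-in for the real processing
    ch.ack(msg);
  });
}

main();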
Not sure what platform you're on, but you might want to look at something like Lighttpd for serving the POSTs. You might (if same-domain restrictions don't shoot you down) get away with having Lighttpd run on a subdomain of your application (so post.myapp.com). Failing that, you could put a proper load balancer in front of your web servers altogether (so all requests go to www.myapp.com and the load balancer decides whether to forward them to the web server or to the queue processor).
Hope that helps
Consider using MongoDB for persisting your requests; its fire-and-forget mechanism can help your servers respond faster.
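With the official node.js driver, fire-and-forget corresponds to a write concern of w: 0, which tells the driver not to wait for the server's acknowledgement (the connection details and doc are placeholders):

const { MongoClient } = require('mongodb');

async function save(doc) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const requests = client.db('app').collection('requests');
  // w: 0 means the call returns without waiting for the server to confirm the write
  await requests.insertOne(doc, { writeConcern: { w: 0 } });
}

The trade-off is that failed writes go unnoticed, which matches the "I don't care if 0.1% of data is lost" requirement above.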