I did some search on the question, but it seems like people only emphasize on Non-blocking IO.
Let's say if I just have a very simple application to respond "Hello World" text to the client, it still needs time to finish the execution, no matter how quick it is. What if there are two request coming in at exactly the same time, how does Node.js make sure both requests will be processed with one thread?
I read the blog Understanding the node.js event loop which says "Of course, on the backend, there are threads and processes for DB access and process execution". That statement is regarding IO, but I also wonder if there is separate thread to handle the request queue. If that's the case, can I say that the Node.js single thread concept only applies to the developers who build applications on Node.js, but Node.js is actually running on multi-threads behind the scene?
The operating system gives each socket connection a send and receive queue. That is where the bytes sit until something at the application layer handles them. If the receive queue fills up no connected client can send information until there is space available in the queue. This is why an application should handle requests as fast as possible.
If you are on a *nix system you can use netstat to view the current number of bytes in the send and receive queues. In this example, there are 0 bytes in the receive queue and 240 bytes in the send queue (waiting to be sent out by the OS).
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 240 x.x.x.x:22 x.x.x.x:* LISTEN
On Linux you can check the default size and max allowed size of the send/receive queues with the proc file system:
Receive:
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/rmem_max
Send:
cat /proc/sys/net/core/wmem_max
cat /proc/sys/net/core/wmem_default
Related
I'm writing a TCP server application using NodeJS. However, each socket runs on a separate child-process (server.on("connection")). To send messages to specific clients, I used Emitter, and each socket generates its own listener (on clientID). So if there are 10000 connected devices, the application will create 10000 listeners. This looks terrible. What dangers will this pose? I can't find a solution to send a message from one client to another in the TCP protocol writing NodeJS code.
Update:
Have any idea to send message to specific client without add custom listeners?
However, each socket runs on a separate process.
Why would you do that? The core idea behind NodeJS is to run things in an event loop. Single threaded, yes, but asynchronous.
This looks terrible. What dangers will this pose?
It is terrible. The biggest issue is that you sacrifice a lot of resources. You not only spawn thousands of processes but you also spawn lots of emitters. So first of all this means lots of RAM eaten. Secondly this means degraded performance due to process context switch, which typically is slower than user space switch. Assuming your machine will even allow you to spawn so many processes.
I can't find a solution to send a message from one client to another in the TCP protocol writing NodeJS code.
I assume you have a TCP server, two connected clients and client A wants to send message to client B. Is that correct? TCP by itself won't do that for you. You need some protocol on top of it. For example:
Client connects to the server. At this point the client is not logged in and cannot do anything except for authentication.
Client authenticates. It sends (username, password) pair to the server. The server validates the pair. The server keeps a global mapping {"<username>": [sockets]} and adds newly authenticated client to that mapping.
Client A wants to send a message to client B. So it sends data of the form {"type": "direct", "destination": "clientB", "data": "hello B"}. The server parses the message and forwards it to the appropriate client (taken from the global mapping).
In case when you want to broadcast the message you send say {"type":"broadcast", "data": "hello all"} kind of message. The server then parses it, it loops through all connected clients (found in the global mapping) and forwards the message to each client.
Of course you also need some framing of packets. Since TCP is a stream, then it doesn't really understand messages and where one starts and the other ends. Dumping things to JSON is a half of the problem. Because then you have to send this JSON over the network and the other side has to know how many bytes it has to read. One way is to prefix each message with, say, 2 bytes that tell the other side how long the message is.
Btw you may want to consider using socket.io (or some other lib) that take care of some of those tedious details for you.
My node.js app currently subscribes to a number of websocket servers that is starting to push a lot of data over to the app every second. This websocket client app has several event handlers that does some work upon receiving the websocket data.
However, Node.js appears to be only using 1 CPU core at any one time, leave the remaining cores under utilized. This is expected as Node.js uses a single-threaded event loop model.
Is it possible to load balance the incoming websocket data handling over multiple CPU cores? I understand that Node Cluster and pm2 Cluster Mode are able to load balance if you are running websocket servers, but how about websocket clients?
From the client side, I can think of the following options:
Create some child processes (likely one for each CPU core you have) and then divide your webSocket connections among those child processes.
Create node.js WorkerThreads (likely one for each CPU core you have) and then divide your webSocket connections among those WorkerThreads.
Create node.js WorkerThreads (likely one for each CPU core you have) and create a work queue where each incoming piece of data from the various webSocket connections is put into the work queue. Then, the WorkerThreads are regular dispatched data from the queue to work on. As they finish a piece of data, they are given the next piece of data from the queue and so on...
How to best solve this issue really depends upon where the CPU time is mostly being spent. If it is the processing of the incoming data that is taking the most time, then any of these solutions will help apply multiple CPUs to that task. If it is the actual receiving of the incoming data, then you may need to move the incoming webSockets themselves to a new thread/process as in the first two options.
If it's a system bandwidth issue due to the volume of data, then you may need to increase the bandwidth of your network connection of may need multiple network adapters involved.
I have read a lot of article about how NodeJs works. But I still can not figure out exactly how the internal threads of Nodejs proceed IO operations.
In this answer https://stackoverflow.com/a/20346545/1813428 , he said there are 4 internal threads in the thread pool of NodeJs to process I/O operations . So what if I have 1000 request coming at the same time , every request want to do I/O operations like retrieve an enormous data from the database . NodeJs will deliver these request to those 4 worker threads respectively without blocking the main thread . So the maximum number of I/O operations that NodeJs can handle at the same time is 4 operations. Am I wrong?.
If I am right , where will the remaining requests will handle?. The main single thread is non blocking and keep driving the request to corresponding operators , so where will these requests go while all the workers thread is full of task? .
In the image below , all of the internal worker threads are full of task , assume all of them need to retrieve a lot of data from the database and the main single thread keep driving new requests to these workers, where will these requests go? Does it have a internal task queuse to store these requests?
The single, per-process thread pool provided by libuv creates 4 threads by default. The UV_THREADPOOL_SIZE environment variable can be used to alter the number of threads created when the node process starts, up to a maximum value of 1024 (as of libuv version 1.30.0).
When all of these threads are blocked, further requests to use them are queued. The API method to request a thread is called uv_queue_work.
This thread pool is used for any system calls that will result in blocking IO, which includes local file system operations. It can also be used to reduce the effect of CPU intensive operations, as #Andrey mentions.
Non-blocking IO, as supported by most networking operations, don't need to use the thread pool.
If the source code for the database driver you're using is available and you're able to find reference to uv_queue_work then it is probably using the thread pool.
The libuv thread pool documentation provides more technical details, if required.
In the image below , all of the internal worker threads are full of task , assume all of them need to retrieve a lot of data from the database and the main single thread keep driving new requests to these workers
This is not how node.js use those threads.
As per Node.js documentation, the threads are used like this:
All requests and responses are "handled" in the main thread. Your callbacks (and code after await) simply take turns to execute. The "loop" between the javascript interpreter and the "event loop" is usually just a while loop.
Apart from worker_threads that you yourself start there are only 4 things node.js use threads for: waiting for DNS response, disk I/O, the built-in crypto library and the built-in zip library. Worker_threads are the only places where node.js execute javascript outside the main thread. All other use of threads execute C/C++ code.
If you are want to know more then I've written several answers to related questions:
Node js architecture and performance
how node.js server is better than thread based server
node js - what happens to incoming events during callback excution
Does javascript process using an elastic racetrack algorithm
Is there any other way to implement a "listening" function without an infinite while loop?
no, main use case for thread pool is offloading CPU intensive operations. IO is performed in one thread - you don't need multiple threads if you are waiting external data in parallel, and event loop is exactly a technique to organise execution flow so that you wait for as much as possible in parallel
Example:
You need to send 100 emails with a question (y/n) and another one with number of answered "y". It takes about 30 second to write email and two hours on average for reply + 10 seconds to read response. You start by writing all 100 emails ( 50 minutes of time ), then you wait alert sound which wakes you up every time reply arrives, and as you receive answers you increase number of "y". in ~2 hours and 50 minutes you're done. This is example of async IO and event loop ( no thread pools )
Blocking example: send email, wait for answer, repeat. Takes 4 days (two if you can clone another you )
Async thread pool example: each response is in the language you don't know. You have 4 translator friends. You email text to them, and they email translated text back to you (Or, more accurate: you print text and put it into "needs translation" folder. Whenever translator available, text is pulled from folder)
I want to handle a lot of (> 100k/sec) POST requests from javascript clients with some kind of service server. Not many of this data will be stored, but I have to process all of them so I cannot spend my whole server power for serving requests only. All the processing need to be done in the same server instance, otherwise I'll need to use database for synchronization between servers which will be slower by orders of magnitude.
However I don't need to send any data back to the clients, and they don't even expect them.
So far my plan was to create few proxy servers instances which will be able to buffer the request and send them to main server in bigger packs.
For example let's say that I need to handle 200k requests / sec and each server can handle 40k. I can split load between 5 of them. Then each one will be buffering requests and sending them back to main server in packs of 100. This will result in 2k requests / sec on the main server (however, each message will be 100 times bigger - which probably means around 100-200kB). I could even send them back to the server using UDP to decrease amount of needed resources (then I need only one socket on main server, right?).
I'm just thinking if there is no other way to speed up the things. Especially, when as I said I don't need to send anything back. I have full control over javascript clients also, but unlucky javascript is unable to send data using UDP which probably would be solution for me (I don't even care if 0.1% of data will be lost).
Any ideas?
Edit in response to answers given me so far.
The problem isn't with server being to slow at processing events from the queue or with putting events in the queue itself. In fact I plan to use disruptor pattern (http://code.google.com/p/disruptor/) which was proven to process up to 6 million requests per second.
The only problem which I potentially can have is need to have 100, 200 or 300k sockets open at the same time, which cannot be handled by any of the mainstream servers. I know some custom solutions are possible (http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3) but I'm wondering if there is no way to even better utilization of fact that I don't have to replay to clients.
(For example some way to embed part of the data in initial TCP packet and handle TCP packets as they would be UDP. Or some other kind of magic ;))
Make a unique and fast (probably in C) function that get's all requests, from a very fast server (like nginx). The only job of this function is to store the requests in a very fast queue (like redis if you got enought ram).
In another process (or server), depop the queue and do the real work, processing request one by one.
If you have control of the clients, as you say, then your proxy server doesn't even need to be an HTTP server, because you can assume that all of the requests are valid.
You could implement it as a non-HTTP server that simply sends back a 200, reads the client request until it disconnects, and then queues the requests for processing.
I think what you're describing is an implementation of a Message Queue. You also will need something to hand off these requests to whatever queue you use (RabbitMQ is quite good, there are many alternatives).
You'll also need something else running which can do whatever processing you actually want on the requests. You haven't made that very clear, so I'm not too sure exactly what would be right for you. Essentially the idea will be that incoming requests are dumped as quickly as simply as possible into the queue by your web server, and then the web server is free to go back to serving more requests. When the system has some resources, it uses them to process the queue, but when it's busy the queue just keeps growing.
Not sure what platform you're on, but might want to look at something like Lighttpd for serving the POSTs. You might (if same-domain restrictions don't shoot you down) get away with having Lighttpd running on a subdomain of your application (so post.myapp.com). Failing that you could put a proper load balancer in front of your webservers altogether (so all requests go to www.myapp.com and the load balancer decides whether to forward them to the web server or the queue processor).
Hope that helps
Consider using MongoDB for persisting your requests, it's fire and forget mechanism can help your servers to response faster.
I'd like some opinions on the practical implications of moving processing that would traditionally be done on the server to be handled instead by the client in a node.js web app.
Example case study:
The user uploads a CSV file containing a years worth of their bank statement entries. We want to parse the file, categorise each entry and calculate cumulative values for each category so that we can store the newly categorised statement in a db and display spending analysis to the user.
The entries are categorised by matching strings in the descriptions. There are many categories and many entries and it takes a fair amount of time to process.
In our node.js server, we can happily free up the event loop whilst waiting for network responses and so on, but if there is any data crunching or similar processing, the server will be blocked from responding to requests, and this seems unavoidable.
Traditionally, the CSV file would be passed to the server, the server would process, save in db, and send back the output of the processing.
It seems to make sense in our single threaded node.js server that this processing is handled by the browser, and the output displayed and sent to server to be stored. Of course the client will have to wait while this is done, but their processing will not be preventing the server from responding to requests from other clients.
I'm interested to see if anyone has had experience build apps using this model.
So, the question is.. are there any issues in getting browsers rather than the server to handle, wherever possible, any processing that will block the event loop? Is this a good/sensible/viable approach to node.js application development?
I don't think trusting client processed data is a good idea.
Instead you should look into creating a work queue that a separate process listens on, separating the CPU intensive tasks from your node.js process handling HTTP requests.
My proposed data flow would be:
HTTP upload request
App server (save raw file somewhere the worker process can access)
Notification to 'csv' work queue
Worker processes uploaded csv file.
Although perfectly possible, simply shifting the processing to the client machine does not solve the basic problem.
Now the client's event loop is blocked, preventing the user from interacting with the browser. Browsers tend to detect this problem and stop execution of the page's script altogether. Something your users will certainly hate.
There is no way around either delegating or splitting up the work-load.
Using a second process (for example a 2nd node instance) for doing the number crunching server-side has the added benefit of allowing the operating system to use a 2nd CPU core. Ideally you run as many Node instances as you have CPU cores in the server and balance your work-load between them. Have a look at the diode module for some inspiration on how to implement multi-process communication in node.