I'm writing an application that makes heavy use of the http.request method.
In particular, I've found that sending 16 or more ~30 KB requests simultaneously really bogs down a Node.js instance on a machine with 512 MB of RAM.
I'm wondering if this is to be expected, or if Node.js is just the wrong platform for outbound requests.
Yes, this behavior seems perfectly reasonable.
I would be more concerned if it were doing the work you described without any noticeable load on the system (in which case it would take a very long time). Remember that Node is just an evented I/O runtime, so you can have faith that it is scheduling your I/O requests (about) as quickly as the underlying system can. It is using the system to its (nearly) maximum potential, hence the system being "really bogged down".
One thing you should be aware of is the fact that http.request does not create a new socket for each call. Each request occurs on an object called an "agent", which contains a pool of up to 5 sockets. If you are using the v0.6 branch, you can raise this limit with:
http.globalAgent.maxSockets = Infinity
Try that and see if it helps.
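For illustration, a minimal sketch of both approaches — raising the global limit as above, or giving these requests their own agent. The host, pool size, and response handling below are placeholder choices, and the exact Agent API varies between Node versions:

var http = require('http');

// Option 1: raise the limit on the shared global agent
http.globalAgent.maxSockets = Infinity;

// Option 2: give these requests a dedicated agent with a larger pool
var agent = new http.Agent();
agent.maxSockets = 64; // example value, tune for your machine

var req = http.request({
  host: 'example.com', // placeholder host
  path: '/',
  method: 'GET',
  agent: agent
}, function (res) {
  res.on('data', function () {}); // drain the response so the socket is freed
  res.on('end', function () {
    console.log('status:', res.statusCode);
  });
});
req.end();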
Related
Since Node.js 10.5, worker threads have been available, which makes Node.js a multi-threaded environment.
Previously, with only one thread in Node.js, there was no CPU time slicing happening, because of the event-driven nature (if I understand correctly).
So now, with multiple threads in Node on one physical CPU core, how do they share the CPU? Does the OS scheduler allot time for each thread to run for varying amounts of time, or what?
Worker threads are advertised as
The Worker class represents an independent JavaScript execution thread.
So something like starting another NodeJS instance but within the same process and with a bare minimum of a communication channel.
Worker threads in NodeJS mimic the Worker API in modern browsers (not a coincidence, NodeJS is basically a browser without UI and with a few extra JS APIs), and in that context worker threads are really native threads, scheduled by the OS.
The description quoted above seems to imply that in NodeJS too, worker threads are implemented with native threads rather than with a scheduling managed by NodeJS.
The latter would be useless as this is exactly what the JS event loop coupled with async methods do.
So basically a worker thread is just another "instance" (context) of NodeJS run by another native thread in the same process.
Being a native thread, it is managed and scheduled by the OS. And just like you can run more than one program on a single CPU, you can do that with threads (fun fact: in many OSes, threads are the only schedulable entities. Programs are just a group of threads with a common address space and other attributes).
As NodeJS is open source, it is easy to confirm this, see the Worker::StartThread and the Worker::Run functions.
The new thread will execute JS code just like the main one but it has been limited in the way it can interact with the environment (particularly the process itself).
This is in line with the JS approach to multithreading where it is more of "two or more message loops" than real multithreading (where threads are free to interact with each other with all the implication at the architectural level).
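To make the "another instance with a communication channel" point concrete, here is a minimal sketch using the worker_threads module. The file names and the message shape are made up for illustration; on Node 10.x the module still needs the --experimental-worker flag.

// main.js
const { Worker } = require('worker_threads');

const worker = new Worker('./worker.js'); // starts a new native thread with its own event loop
worker.on('message', (msg) => console.log('from worker:', msg));
worker.postMessage({ n: 42 }); // messages are copied between the two JS contexts

// worker.js
const { parentPort } = require('worker_threads');

parentPort.on('message', ({ n }) => {
  // CPU-bound work here does not block the main thread's event loop;
  // the OS schedules this native thread alongside the main one.
  parentPort.postMessage(n * 2);
});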
Background: I came from Microsoft world, in which I used to have websites stored on IIS. Experience taught me to recycle my application pool once a day in order to eliminate weird problems due to fragmentation. Recycling the app pool basically means to restart your application without restarting the entire IIS. I also watched a lecture that explained how Microsoft had reduced the fragmentation a lot in .Net 4.5.
Now, I'm deploying a Node.js application to production environment and I have to make sure that it works flawlessly all the time. I originally thought to make my app restarted once a day. Then I did some research in order to find some clues about fragmentation problems in Node.js. The only thing I've found is a scrap of paragraph from an article describing GC in V8:
To ensure fast object allocation, short garbage collection pauses, and “no memory fragmentation,” V8 employs a stop-the-world, generational, accurate garbage collector.
This statement is really not enough for me to give up building a restart mechanism for my app, but on the other hand I don't want to do some work if there is no problem.
So my question is:
Should or shouldn't I restart my app every now and then in order to prevent fragmentation?
Implementing a server restart before you know that memory consumption is indeed a problem is a premature optimization. As such, I don't think you should do it until you actually find that it is a problem. You will likely find more important issues to optimize for as opposed to memory consumption.
To figure out if you need a server restart, I recommend doing the following:
Set up some monitoring tools like https://newrelic.com/ that let you monitor your performance.
Monitor your memory continuously. Try to see if there is steady increase in the amount of memory consumed, or if it levels off.
Decide upon an acceptable threshold before you need to act. For example once your app consumes 60% of system memory you need to start thinking about a server restart and decide upon the restart interval.
Decide if you are OK with having "downtime" while restarting the server or not. If you don't want downtime, you may need to build a proxy layer to direct traffic.
In general, I'd recommend server restarts for all dynamic, garbage collected languages. This is fairly common in those types of large applications. It is almost inevitable that a small mistake somewhere in your code base, or one of the libraries you depend on will leak memory. Even if you fix one leak, you'll get another one eventually. This may frustrate your team, which will basically lead to a server restart policy, and a definition of what is acceptable in regards to memory consumption for your application.
I agree with @Parris. You should probably figure out whether you actually need to have a restart policy first. I would suggest using pm2, docs here. Even if you don't want to sign up for Keymetrics, it's a pretty good little process manager and real quick to set up. You can get a report of memory usage from the command line. Looks something like this.
Also, if you start in cluster mode like above, you can call pm2 restart my_app and the first one will probably be up again before the last one is taken offline (this is an added benefit; the real reason for having 8 processes is to utilize all 8 cores). If you are adamant about avoiding downtime, you could restart them one by one according to id.
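For illustration, the commands involved would look roughly like this (the app name and instance count are assumptions; check the pm2 docs for your version):

pm2 start app.js -i 8 --name my_app   # cluster mode, one process per core
pm2 list                              # table of processes with memory/CPU usage
pm2 monit                             # live memory usage from the command line
pm2 restart my_app                    # restarts the cluster's processes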
I agree with @Parris; this seems like a premature optimization. Also, restarting is not a solution to the underlying problem, it's a treatment for the symptoms.
If memory errors are a prevalent issue for your Node application, then I think some thought as to why this fragmentation occurs in your program in the first place would be a valuable effort. Understanding why memory errors occur after a program has been running for a long period of time, and refactoring the architecture of your program to solve the root of the problem, is a better solution in my eyes than just addressing the symptoms.
I believe two things will benefit you.
Immutable objects will help a lot. They are much more predictable than mutable objects and are not affected by how long the project has been live. Also, since immutable objects are read-only blocks of memory, they are faster than mutable objects, for which the server has to spend resources deciding whether to read or write the memory block that stores the object. I currently use the library called Immutable and it works well for me. There are other ones as well, like deep-freeze; however, I have never used it.
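As a small illustration of the read-only point, using the Immutable library's Map (any of its persistent structures behaves the same way):

const { Map } = require('immutable');

const state = Map({ score: 0, lives: 3 }); // a read-only block of memory
const next = state.set('score', 10);       // returns a new Map; the original is untouched

console.log(state.get('score')); // 0
console.log(next.get('score'));  // 10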
Make sure to manage your application's processes correctly; memory leaks are the second big contributor to this problem that I have experienced. Again, this is solved by thinking about how your application is structured and how user events are handled, making sure that once a process is no longer needed by the client it is properly removed from the heap. If it is not, the heap keeps growing until all memory is consumed, causing the application to crash. Node is a C++ program controlled by Google's V8 and JavaScript.
You can use Node.js's process.memoryUsage() to monitor memory usage. When it comes to managing your heap, V8 offers two garbage collection strategies: Scavenge, which is very quick but incomplete, and Mark-Sweep, which is slower but frees all unreferenced memory.
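A minimal sketch of that kind of monitoring; the interval is an arbitrary example value:

// Log heap usage periodically; a heapUsed figure that keeps climbing and never
// levels off after garbage collection is the usual sign of a leak.
setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const mb = (n) => (n / 1024 / 1024).toFixed(1) + ' MB';
  console.log(`rss=${mb(rss)} heapTotal=${mb(heapTotal)} heapUsed=${mb(heapUsed)}`);
}, 30 * 1000);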
Refer to this blog post for more info on how to manage your heap and manage your memory on V8 which runs Node.js
So the responsible approach to your implementation is to keep a close eye on open processes, maintain a deep understanding of the heap, and know how to free unreferenced memory blocks. Building your project with this in mind also makes it a lot more scalable.
I'm interested in building a small real-time multiplayer game, using HTML5/JavaScript for the client and probably Java for the server software.
I looked into WebSockets a bit, but it appears I had misconceptions about what WebSockets actually are. I had initially thought of WebSockets as just JavaScript's way of handling TCP sockets, just as they are used in Java and other languages, but it appears there is a whole handshaking process that must take place, and each transmission includes a lot of HTTP overhead (and in that case, the benefits over Ajax do not seem as exciting as at first glance)?
On a related topic, are there any better alternatives to WebSockets for this purpose (real-time multiplayer games in JavaScript)?
WebSockets are the best solution for realtime multiplayer games running in a web browser. As pointed out in the comments there is an initial handshake where the HTTP connection is upgraded but once the connection is established WebSockets offer the lowest latency connection mechanism for bi-directional communication between a server and a client.
I'd recommend you watch this: https://www.youtube.com/watch?v=_t28OPQlZK4&feature=youtu.be
Have a look at:
http://browserquest.mozilla.org/ code available here: https://github.com/mozilla/BrowserQuest
https://chrome.com/supersyncsports/
The only raw TCP solution would be to use a plugin which supports some kind of TCPClient object. I'd recommend you try out WebSockets.
You can find a number of options here. Just search for WebSockets within the page.
Also take a look at WebRTC. Depending on the purpose of your game and whether you need your server to manage game state, you could use this technology for peer-to-peer communication. You may still need a solution to handle putting players into groups - in that case WebSockets is the fastest/best solution.
Basically, you have 3 options at the time of this writing:
WebSockets
WebSockets is a lightweight messaging protocol that runs over TCP, rather than a JavaScript implementation of TCP sockets, as you've noted. Aside from the initial handshake, no HTTP headers are passed back and forth. Once the connection is established, data passes freely, with minimal overhead.
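As a minimal sketch of what that looks like in practice — here using the third-party ws package on the Node side; the port and message format are arbitrary choices:

// server.js (Node, using the 'ws' package)
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (socket) => {
  socket.on('message', (data) => {
    socket.send(data); // echo the message back; no HTTP headers per message
  });
});

// client (browser)
const ws = new WebSocket('ws://localhost:8080');
ws.onopen = () => ws.send(JSON.stringify({ type: 'move', x: 10, y: 20 }));
ws.onmessage = (event) => console.log('server said:', event.data);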
Long-polling
Long-polling, in a nutshell, involves the client polling the server for new information periodically with HTTP requests. This is extremely expensive in terms of CPU and bandwidth, as you're sending a hefty new HTTP header each time. This is essentially your only option when it comes to older browsers, and libraries such as Socket.io use long-polling as a fallback in these cases.
WebRTC
In addition to what has been mentioned already, WebRTC allows for communication via UDP. UDP has long been used in multiplayer games in non web-based environments because of its low overhead (relative to TCP), low latency, and non-blocking nature.
TCP "guarantees" that each packet will arrive (save for catastrophic network failure), and that they will always arrive in the order that they were sent. This is great for critical information such as registering scores, hits, chat, and so on.
UDP, on the other hand, has no such guarantees. Packets can arrive in any order, or not at all. This is actually useful when it comes to less critical data that is sent at a high frequency, and needs to arrive as quickly as possible, such as player positions or inputs. The reason being that TCP streams are blocked if a single packet gets delayed during transport, resulting in large gaps in game state updates. With UDP, you can simply ignore packets that arrive late (or not at all), and carry on with the very next one you receive, creating a smoother experience for the player.
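For what it's worth, WebRTC data channels let you ask for exactly that UDP-like behavior. The channel label and payload below are illustrative, and the signaling needed to connect the peers is omitted:

const pc = new RTCPeerConnection();

// Unordered, no retransmissions: a late position update is simply dropped,
// so one lost packet never stalls the updates behind it.
const channel = pc.createDataChannel('positions', {
  ordered: false,
  maxRetransmits: 0
});

channel.onopen = () => channel.send(JSON.stringify({ x: 10, y: 20 }));
channel.onmessage = (event) => console.log('peer position:', event.data);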
At the time of this writing, WebSockets are probably your best bet, though WebRTC adoption is expanding quickly, and may actually be preferable by the time you're done with your game, so that's something to consider.
I'm not sure if WebSockets are still the best tool for networking a real-time multiplayer game these days (2017). WebRTC is a newer technology which offers the potential of much higher performance. And these days, WebRTC is also easier to work with thanks to the following libraries:
node-webrtc simplifies server-side networking
webrtc-native which also provides a server-side library, and could be faster as its name suggests
electron-webrtc provides an implementation which is a good match if you want to package your game using electron
Alternatively, if you want to be spared the actual details of networking implementation, and you're looking for a library which provides a higher-level multiplayer interface, take a look at Lance.gg. (disclaimer: I am one of the contributors).
Multiplayer games require the server to send periodic snapshots of the world state to the client. In the context of a browser HTML/JS application you have few choices: polling, WebSocket, or writing your own plugin to extend browser capabilities.
HTTP polling techniques such as BOSH or Bayeux are sophisticated but introduce network overhead and latency. WebSocket was designed to overcome their limitations and is definitely more responsive.
Libraries such as CometD or Socket.IO provide an abstraction of the transport and solve the browser compatibility issues for you. On top of that, they let you switch between the underlying transports and compare their performance without effort.
I coded a multiplayer arcade game with Socket.IO and usually measure 2 ms latency with a WebSocket and around 30 ms with xhr-polling on a LAN. That's enough for a multiplayer game.
I suggest you have a look at Node.js and Socket.IO so you can share code between the client and the server; you can also borrow some multiplayer code at [3].
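A rough sketch of that approach with Socket.IO (the event name and port are made up):

// server.js
const io = require('socket.io')(3000);

io.on('connection', (socket) => {
  socket.on('move', (data) => {
    socket.broadcast.emit('move', data); // relay one player's move to everyone else
  });
});

// client (browser, with the socket.io client script loaded)
const socket = io('http://localhost:3000');
socket.emit('move', { x: 1, y: 2 });
socket.on('move', (data) => console.log('other player moved:', data));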
If you are planning to use JavaScript for your game (as you are), then WebSocket is the best choice for you. And if you want to support older versions of Internet Explorer, look at SignalR, the system Microsoft developed. It uses WebSocket under the hood, but also has a few fallback options, so the protocol will use the best solution available.
http://signalr.net/
I have a problem using PhantomJS with web server module in a multi-threaded way, with concurrent requests.
I am using PhantomJS 2.0 to create highstock graphs on the server-side with Java, as explained here (and the code here).
It works well, and when testing graphs of several sizes, I got results that are pretty consistent, about 0.4 seconds to create a graph.
The code that I linked to was originally published by the highcharts team, and it is also used in their export server at http://export.highcharts.com/. In order to support concurrent requests, it keeps a pool of spawned PhantomJS processes, and basically its model is one phantomjs instance per concurrent request.
I saw that the webserver module supports up to 10 concurrent requests (explained here), so I thought I can tap on that to keep a lesser number of PhantomJS processes in my pool. However, when I tried to utilize more threads, I experienced a linear slow down, as if PhantomJS was using only one CPU. This slow-down is shown as follows (for a single PhantomJS instance):
1 client thread, average request time 0.44 seconds.
2 client threads, average request time 0.76 seconds.
4 client threads, average request time 1.5 seconds.
Is this a known limitation of PhantomJS? Is there a way around it?
(question also posted here)
Is this a known limitation of PhantomJS?
Yes, it is an expected limitation, because PhantomJS uses the same WebKit engine for everything and since JavaScript is single-threaded, this effectively means that every request will be handled one after the other (possibly interlocked), but never at the same time. The average overall time will increase linearly with each client.
The documentation says:
There is currently a limit of 10 concurrent requests; any other requests will be queued up.
There is a difference between the notions of concurrent and parallel requests. Concurrent simply means that the tasks finish non-deterministically. It doesn't mean that the instructions that the tasks are made of are executed in parallel on different (virtual) cores.
Is there a way around it?
Other than running your server tasks through child_process, no. The way JavaScript supports multi-threading is by using Web Workers, but a worker is sandboxed and has no access to require and therefore cannot create pages to do stuff.
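So in practice the workaround is what the highcharts export server already does: spawn several phantomjs processes from the host application and spread requests across them. A rough sketch from Node, where the render script and its arguments are placeholders:

const { execFile } = require('child_process');

// Spawn a separate phantomjs process per render; each one gets its own
// WebKit engine and therefore its own CPU core. A real setup would keep
// long-lived processes in a pool rather than spawning one per request.
function renderChart(configPath, outPath, callback) {
  execFile('phantomjs', ['render.js', configPath, outPath], (err, stdout, stderr) => {
    callback(err, stdout);
  });
}

renderChart('chart-config.json', 'chart.png', (err) => {
  if (err) console.error('render failed:', err);
  else console.log('chart written');
});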
I'm working on chat application. The underlying server uses Node.js and the client/server communication goes via WebSockets.
So the question is: how many simultaneous connections can such a server handle (without visible lags)? Of course approximately and assuming that the server is very powerful machine. I know this is not an easy question to answer but I just want ideas, some approximations... or at least upper and lower bounds. Of course I'm going to do some practical tests, but the theory may help me a bit.
Also, I have another question related to the first one: is it possible to split Node.js applications across multiple machines? Keep in mind that most of the data is held in machine memory rather than in a database.
Waiting for replies. :)
You'll want to make sure you run node.js on top of epoll/kqueue and tune your OS for high TCP connection numbers.
Here are a couple of measured numbers for a publish/subscribe system based on Autobahn WebSockets:
180k concurrent, active WebSocket connections
12k/s dispatched pubsub messages
4k/s WebSocket opening handshakes
<8kB per WebSocket connection
This is on a FreeBSD i386 virtual machine configured with 2 cores and 2GB RAM.
Autobahn WebSockets is Python/Twisted based and runs on a kqueue reactor.
I've made a simple benchmark of a multiroom chat application. Just remember to monitor the CPU% when running it; it's possible that the benchmark app will drain the CPU faster than the server!