High CPU Utilization for Meteor.js - javascript

A Meteor.js 0.8.2 app is running on an Ubuntu 14.04 server with 2 GB of memory and 2 CPU cores. It was deployed using mup. However, CPU utilization is very high: htop reports a load average of 2.72.
Question: How do I find out which part of the app is causing such high CPU utilization? I used Kadira, but as far as I can tell it does not reveal anything taking up a lot of CPU.
Does Meteor only use a single core?

I had a similar problem before with Meteor 0.8.2-0.8.3. Here is what I did to reduce CPU usage; I hope you find it useful.
- Double-check your functions: ensure every function has a proper return value and properly catches errors.
- Use a replica set and the oplog: convert your standalone mongod to a replica set.
- Write a script to automatically kill and respawn a node process if it exceeds 100% CPU usage.
- Utilize multi-core capability by starting 2 processes (edit: you have done this already) and setting up load balancing behind a reverse proxy; see the cluster sketch at the end of this answer.
- Review your publications and subscriptions and limit what data is sent to the client (simply avoid something like Collection.find() with no selector or field limits); see the sketch just below.
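To make that last point concrete, a publication can restrict both the documents and the fields it ships to clients. This is a minimal sketch; the collection name, field names, and limit are hypothetical placeholders, not taken from the original question:

    // server/publications.js
    // Publish only the fields the client actually needs, capped at 50 documents.
    // "Posts", "title", "createdAt", and "authorId" are placeholder names.
    Meteor.publish('recentPosts', function () {
      return Posts.find(
        {},  // selector: narrow this further if you can
        {
          fields: { title: 1, createdAt: 1, authorId: 1 },  // whitelist fields
          limit: 50,                                        // cap the result set
          sort: { createdAt: -1 }
        }
      );
    });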
Personally, I recommend Phusion Passenger; it makes deploying Meteor applications easy, and I have used it for several projects without any major problems.
One more thing: avoid running the processes as root (or another privileged user); you should run your apps as a user like www-data, for obvious security reasons.
P.S. The multiple mongo processes showing in htop are threads under a master process; you can view them in tree mode by pressing F5.
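Regarding the multi-core item above: Meteor runs as a single Node process, so one instance will only ever saturate one core, and the usual fix is to run one instance per core behind a load balancer. As a generic illustration of that multi-process pattern (not how mup deploys Meteor specifically), here is a minimal sketch using Node's built-in cluster module, with a plain HTTP handler standing in for the app:

    // cluster-sketch.js -- generic Node example, not Meteor-specific
    var cluster = require('cluster');
    var http = require('http');
    var os = require('os');

    if (cluster.isMaster) {
      // Fork one worker per CPU core.
      os.cpus().forEach(function () {
        cluster.fork();
      });
      // Respawn a worker if it dies.
      cluster.on('exit', function () {
        cluster.fork();
      });
    } else {
      http.createServer(function (req, res) {
        res.end('handled by worker ' + process.pid + '\n');
      }).listen(3000);
    }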

Related

Load testing of Web UI for 20K users

I have developed automation scripts for end-to-end workflows using Selenium WebdriverIO for my web application. I want to use the same scripts for load testing. My requirement is to test the web UI along with the backend APIs for 20K users. Can you please suggest how I can achieve this with Selenium?
For 20K users you will need 20K browsers, and for 20K browsers I don't know exactly how much hardware you will need in reality, but looking at the Firefox 87 system requirements:
- a CPU core per browser instance
- 2 GB of RAM per browser instance
If you have such a supercomputer somewhere, or a budget to spin up that many machines in the cloud, you can scale your Selenium tests using, for example, a K8S cluster.
However, it might be a better idea to consider converting your Selenium tests into HTTP-protocol-based load tests. The majority of load testing tools provide an HTTP proxy server for recording tests, so you can replay your Selenium tests through this proxy and have them converted into a load-testing-tool script.
Protocol-based tests have a much smaller footprint in terms of CPU and RAM, so you should be able to use reasonably small hardware for conducting your tests.
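To make the protocol-based approach concrete, here is a minimal sketch of such a test written for k6, a load testing tool scripted in JavaScript. The URL and the user count are placeholders; in practice you would record the real request flow through the tool's proxy first:

    // load-test.js -- run with: k6 run load-test.js
    import http from 'k6/http';
    import { check, sleep } from 'k6';

    export const options = {
      vus: 100,        // ramp this toward 20K once the script is validated
      duration: '5m',
    };

    export default function () {
      // Placeholder endpoint; a recorded script would replay the full workflow.
      const res = http.get('https://example.com/api/health');
      check(res, { 'status is 200': (r) => r.status === 200 });
      sleep(1);        // think time between iterations
    }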

PhantomJS with embedded web server uses only one CPU

I have a problem using PhantomJS with the web server module in a multi-threaded way, with concurrent requests.
I am using PhantomJS 2.0 to create Highstock graphs on the server side with Java, as explained here (and the code here).
It works well, and when testing graphs of several sizes, I got fairly consistent results: about 0.4 seconds to create a graph.
The code that I linked to was originally published by the Highcharts team, and it is also used in their export server at http://export.highcharts.com/. In order to support concurrent requests, it keeps a pool of spawned PhantomJS processes; basically its model is one PhantomJS instance per concurrent request.
I saw that the web server module supports up to 10 concurrent requests (explained here), so I thought I could tap into that to keep a smaller number of PhantomJS processes in my pool. However, when I tried to use more threads, I saw a linear slowdown, as if PhantomJS were using only one CPU. The slowdown looks as follows (for a single PhantomJS instance):
- 1 client thread: average request time 0.44 seconds.
- 2 client threads: average request time 0.76 seconds.
- 4 client threads: average request time 1.5 seconds.
Is this a known limitation of PhantomJS? Is there a way around it?
(question also posted here)
Is this a known limitation of PhantomJS?
Yes, it is an expected limitation. PhantomJS uses the same WebKit engine for everything, and since JavaScript execution is single-threaded, this effectively means that every request is handled one after the other (possibly interleaved), but never at the same time. The average overall time will therefore increase linearly with each additional client.
The documentation says:
There is currently a limit of 10 concurrent requests; any other requests will be queued up.
There is a difference between the notions of concurrent and parallel requests. Concurrent simply means that the tasks finish non-deterministically; it doesn't mean that the instructions the tasks are made of are executed in parallel on different (virtual) cores.
Is there a way around it?
Other than running your server tasks through child_process, no. The way JavaScript supports multi-threading is through Web Workers, but a worker is sandboxed: it has no access to require and therefore cannot create pages to do the rendering.
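To sketch that workaround: a controlling Node process can spawn several PhantomJS processes (one per core) and farm requests out to them, which is essentially the pool model used by the export server mentioned above. A minimal, hypothetical sketch, assuming a render.js PhantomJS script exists that listens on the given port:

    // pool-sketch.js -- spawn one PhantomJS process per core, round-robin work to them
    var spawn = require('child_process').spawn;
    var os = require('os');

    var workers = os.cpus().map(function (cpu, i) {
      // render.js is a placeholder PhantomJS script that serves on port 8000 + i.
      return spawn('phantomjs', ['render.js', String(8000 + i)]);
    });

    var next = 0;
    function dispatch(job) {
      // Round-robin: each job goes to the next PhantomJS instance's port.
      var port = 8000 + (next++ % workers.length);
      // Send the job to http://localhost:<port>/ here (e.g. with http.request).
      return port;
    }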

Segmentation Fault during high load concurrency test with WebWorker Threads

So, I'm trying to conduct a test to see how much WebWorker Threads (https://github.com/audreyt/node-webworker-threads) can speed up CPU-intensive tasks in Node.js on a multi-core system.
I actually got this working on a VM with a single core assigned at work, but when I tried it on my home VM with 4 cores, I'm getting a Segmentation Fault after 15-20 requests.
I've got my project up at https://github.com/WakeskaterX/NodeThreading.git
I have tried eliminating pieces to see why I'm getting the SegFault, but even just returning static numbers throws the SegFault after 15-20 requests.
For the loadtest command I'm running:
loadtest -c 4 -t 20 http://localhost:3030/fib?num=30
It runs just fine when it calculates the Fibonacci sequence synchronously, but as soon as it hits a web worker, it segfaults and dumps core. Perhaps this is related to the webworker-threads code on the back end, but I'm mainly wondering why it's happening and how I can debug it further, or fix it so I can test background threading in Node.js.
This is a variable lifetime issue. In general, a long-running worker needs to be assigned into an object instead of a lexical variable; the latter is garbage-collected away when the scope exits.
See https://github.com/WakeskaterX/NodeThreading/pull/1 for the pull request that fixes the issue.
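To illustrate the pattern the fix describes (a sketch, not the actual code from the pull request): keep the Worker reference somewhere that outlives the request handler, such as a module-level object, so the garbage collector cannot reap it while the thread is still computing:

    var Worker = require('webworker-threads').Worker;

    // Buggy pattern: `worker` is a lexical variable, so once the handler
    // returns it can be garbage-collected while the thread is still running.
    // function handle(req, res) {
    //   var worker = new Worker(function () { /* fib work */ });
    // }

    // Fixed pattern: anchor workers in a long-lived object keyed by id.
    var workers = {};
    var nextId = 0;

    function handle(req, res) {
      var id = nextId++;
      workers[id] = new Worker(function () {
        this.onmessage = function (event) {
          // CPU-intensive work happens here, off the main thread.
          postMessage(event.data);
        };
      });
      workers[id].onmessage = function (event) {
        res.end(String(event.data));
        workers[id].terminate();
        delete workers[id];  // release the reference only when the work is done
      };
      workers[id].postMessage(30);
    }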

How does Nodejs performance scale when using http.request?

I'm writing an application that makes heavy use of the http.request method.
In particular, I've found that sending 16+ requests of ~30 KB each simultaneously really bogs down a Node.js instance on a machine with 512 MB of RAM.
I'm wondering if this is to be expected, or if Node.js is just the wrong platform for outbound requests.
Yes, this behavior seems perfectly reasonable.
I would be more concerned if it did the work you described without any noticeable load on the system (in which case it would take a very long time). Remember that Node is just an evented I/O runtime, so you can have faith that it is scheduling your I/O requests (about) as quickly as the underlying system can. Hence it's using the system to its (nearly) maximum potential, hence the system being "really bogged down".
One thing you should be aware of is that http.request does not create a new socket for each call. Each request occurs through an object called an "agent", which contains a pool of up to 5 sockets. If you are using the v0.6 branch, you can raise this limit with:
http.globalAgent.maxSockets = Infinity
Try that and see if it helps.
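For context, a minimal sketch of raising the pool size before firing off many outbound requests in parallel; the target URL is a placeholder, and on modern Node you would instead pass a custom Agent with maxSockets in the request options:

    var http = require('http');

    // Allow more than the default 5 concurrent sockets per host (Node v0.6 era).
    http.globalAgent.maxSockets = Infinity;

    // Fire 16 downloads in parallel against a placeholder host.
    for (var i = 0; i < 16; i++) {
      http.get('http://example.com/payload', function (res) {
        res.on('data', function () {});  // drain the response
        res.on('end', function () {
          console.log('request finished');
        });
      });
    }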

Performance testing on node.js "net"

Does anyone have any recommendations on how to get started with node.js "net" performance testing?
I want to see how my app will scale and want to test 10,000+ concurrent connections!
EDIT: I want to know so I can see if my Ubuntu server configs are correct, etc.
Professional performance testing tools are agnostic to your underlying technology (Node.js / .NET); they see just the output (HTTP requests and responses), so any of them will do.
There's HP's LoadRunner and a lot of others. I have used WebLOAD, which is more cost-effective and a bit easier to use.
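For a quick self-test before reaching for a commercial tool, here is a sketch of a bare "net" echo server plus a client loop that opens many concurrent sockets. The port and connection count are placeholders, and you will likely need to raise the OS file-descriptor limit (e.g. ulimit -n) to get anywhere near 10,000:

    var net = require('net');

    // Minimal echo server.
    var server = net.createServer(function (socket) {
      socket.pipe(socket);
    });

    server.listen(4000, function () {
      // Client side: open many concurrent connections against the server.
      var target = 1000;  // raise toward 10,000 once ulimits allow
      var open = 0;
      for (var i = 0; i < target; i++) {
        var client = net.connect(4000, 'localhost', function () {
          if (++open === target) {
            console.log('all ' + target + ' connections established');
          }
        });
        client.on('error', function (err) {
          console.error('connection failed:', err.message);
        });
      }
    });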
10,000 concurrent connections. Hmm. I would expect such a load to correspond to a user population somewhere in the 500,000-2,000,000 range, at 2% to 0.5% concurrency respectively. If this were an internal-facing corporate app, the expected user population would be somewhere around 83,333 (12%) to 125,000 (8%). These concurrency models come from 15 years of observing corporate and internet-facing applications, comparing levels of concurrency against the defined user population for a given application-facing model (internal corporate vs. public internet).
The reason I bring up the above is that you may be over-stressing your component relative to its defined use, and as a result you could end up chasing engineering ghosts. This can impact your budget and your availability to address other issues that may show up in production use.
Just food for thought,
James Pulley
From the video it seems that memory usage doesn't budge because Node doesn't spawn new processes, which is precisely the reason it has picked up such a huge following. That's what event-driven/non-blocking I/O can do.
