Can I specify a minimum CPU power for one thread - javascript

I've just started to use the npm threads package to move parts of my node server code into separate threads.
What I'm curious about is whether I can specify that a certain thread gets a minimum share of the CPU that other threads aren't allowed to take away.
My worry is this: if my main loop spawns separate threads for intensive algorithms in order to keep that code from blocking other server operations, will those threads end up hogging so much CPU that the main loop still slows down significantly anyway?
I'd want to set a minimum CPU share for the main loop so that it can still do its work lag-free, regardless of the other threads I've spawned.
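For reference, this is roughly the setup I have in mind, sketched here with Node's built-in worker_threads module rather than the threads package (the file name and the runIntensiveJob helper are made up for illustration):

    // main.js - offload the CPU-heavy algorithm so the event loop stays responsive.
    const { Worker } = require('worker_threads');

    function runIntensiveJob(payload) {
      return new Promise((resolve, reject) => {
        const worker = new Worker('./intensive-worker.js', { workerData: payload });
        worker.on('message', resolve);  // result posted back by the worker
        worker.on('error', reject);
      });
    }

    // intensive-worker.js would read workerData, run the heavy algorithm, and
    // call parentPort.postMessage(result) when done. Note that nothing here
    // reserves a CPU share for either thread - scheduling is left to the OS.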

Related

Do processes get slower as the amount of free memory diminishes?

I've always been under the impression that as long as you have free memory, whether it's 100% or 10%, the speed of your processes should not be affected.
However, I recently ran into a situation where it seems that my processes get a lot slower when they use up a greater percentage of the available memory.
It could be a problem with the code itself, but I'm hoping to get a quick sanity check that I haven't been living a lie before delving deeper into the code itself.
It all really depends upon how the app is coded and what it is doing. For some apps, it won't make any difference whether free memory is 10% or 100% as long as there's enough for it to do its job.
For other apps, they may encounter memory fragmentation, they may cause disk swapping, they may even adjust their own behavior because of less available memory (using smaller buffers, forcing data to disk, etc...). In a garbage collected system (like nodejs), a lower memory condition may cause more frequent garbage collection too.
The single biggest performance impact from running lower on memory will be if the app causes the OS to page memory to disk. This is where the virtual memory being used exceeds the actual physical memory and the OS has to substitute some disk space for memory that is allocated. The OS tries to swap memory to disk that hasn't been accessed recently in the hopes that it won't be needed again soon, but sometimes that just doesn't work very efficiently and you get a lot of hard disk thrashing, constantly reading/writing memory to/from disk. Since disks are thousands of times slower than physical memory, this can massively slow things down.
There are also cases of app design where some operations, in an app like Photoshop, will simply run faster with more memory available because the algorithms adapt to the larger amount of memory when working on large objects. A nodejs app or library could be doing the same thing. For example, an image-processing algorithm may be designed to work on images larger than will fit in memory, so it has to decide how much memory is "safe" to allocate and then work on the image in chunks. With a smaller amount of memory available, the work gets done less efficiently, in smaller chunks.
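As a rough illustration of that "adapt to available memory" idea (the 25% budget and the floor/ceiling values below are made-up numbers, not from any particular library):

    const os = require('os');

    // Pick a working-buffer size based on how much memory is currently free.
    function pickChunkSize() {
      const budget = os.freemem() * 0.25;          // only claim a slice of free RAM
      const MIN = 4 * 1024 * 1024;                 // 4 MB floor
      const MAX = 256 * 1024 * 1024;               // 256 MB ceiling
      return Math.max(MIN, Math.min(MAX, Math.floor(budget)));
    }

    // A large image would then be processed in pickChunkSize()-sized pieces:
    // fewer, larger chunks when memory is plentiful; more, smaller ones when it is tight.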
A more common reason why things get slower over time is some sort of internal fragmentation or leak that makes regular housekeeping chores (like allocating memory) less efficient. This may occur either at the heap level or at the app level. This is why some admins schedule long-running processes (like servers) to be automatically restarted every once in a while - to clear up any of this fragmentation or small leaks and regularly start afresh.
If it's a major problem, extensive debugging may be able to explain where any major impacts are coming from, but this is not trivial debugging, as it involves lots of measuring, gathering data, and adjusting what you're looking at based on what you find - all while trying not to influence the very thing you're trying to find and measure.

How come Node.js is faster when it only uses limited threads?

I have been learning Node.js, but I have one question for which I cannot find an answer anywhere. Here is what I understand about Node.js:
It is single-threaded in its architecture and uses the CPU efficiently thanks to its asynchronous, non-blocking event loop.
It executes these asynchronous requests with the help of the built-in library libuv, which uses threads (4 by default) in its internal thread pool. All of this is kept away from the "main" thread that Node.js uses, so we do not have to worry about it.
However, here is my question: suppose there are 100 asynchronous requests (let's say file reads) at once. Since the number of threads libuv uses is limited, how exactly can Node.js handle these 100 asynchronous requests at a time? Ideally it would need 100 threads to get the data back to the event queue quickly. How exactly is this faster than a multi-threaded process?
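To make the scenario concrete, here is the kind of thing I mean (the file paths are made up):

    const fs = require('fs');

    // All 100 calls return immediately; libuv queues the reads internally and
    // services them with its thread pool (4 threads unless UV_THREADPOOL_SIZE
    // says otherwise), completing them a few at a time.
    for (let i = 0; i < 100; i++) {
      fs.readFile(`./data/file-${i}.txt`, (err, data) => {
        if (err) throw err;
        console.log(`file-${i}: ${data.length} bytes`);
      });
    }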
How exactly is this faster than a multi-threaded process?
The simple answer is, sometimes it isn't. No platform/language/compiler is best for every conceivable scenario.
However, sometimes it is faster. Dealing with many threads has its own problems (e.g. threads sharing CPU cores, thread deadlocking, race conditions, etc). In some cases node's approach is faster, because it doesn't have all that overhead to deal with those issues. In other cases it might not be faster.
That being said, there are things you can do (e.g. worker threads) that allow you to tailor node.js to your circumstances if you find you are CPU limited. It is fairly common on web servers to have as many worker threads as CPU cores (or cores minus one, to leave a core free for the OS, etc.).
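A minimal sketch of that "one worker per core (or cores minus one)" setup, assuming Node's built-in worker_threads and a hypothetical ./cpu-task.js worker file:

    const os = require('os');
    const { Worker } = require('worker_threads');

    // Leave one core free for the OS / the main event loop.
    const poolSize = Math.max(1, os.cpus().length - 1);

    const workers = [];
    for (let i = 0; i < poolSize; i++) {
      workers.push(new Worker('./cpu-task.js'));
    }

    // Jobs would then be dispatched round-robin (or via a queue) to these workers,
    // so CPU-bound work never runs on the main thread.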

Does CPU time slicing work in Node.js, with or without worker threads, if there's only one CPU core?

Node.js 10.5 introduced worker threads, which make Node.js a multi-threaded environment.
Previously, with only one thread in Node.js, there was no CPU time slicing happening because of its event-driven nature (if I understand correctly).
So now, with multiple threads in Node on one physical CPU core, how do they share the CPU? Does the OS scheduler allot time for each thread to run for varying amounts of time, or what?
Worker threads are advertised as
The Worker class represents an independent JavaScript execution thread.
So something like starting another NodeJS instance but within the same process and with a bare minimum of a communication channel.
Worker threads in NodeJS mimic the Worker API in modern browsers (not a coincidence: NodeJS is basically a browser without a UI and with a few extra JS APIs), and in that context worker threads really are native threads, scheduled by the OS.
The description quoted above seems to imply that in NodeJS too, worker threads are implemented with native threads rather than with a scheduling managed by NodeJS.
The latter would be useless as this is exactly what the JS event loop coupled with async methods do.
So basically a worker thread is just another "instance" (context) of NodeJS run by another native thread in the same process.
Being a native thread, it is managed and scheduled by the OS. And just like you can run more than one program on a single CPU, you can do the same with threads (fun fact: in many OSes, threads are the only schedulable entities; programs are just a group of threads with a common address space and other attributes).
As NodeJS is open source, it is easy to confirm this, see the Worker::StartThread and the Worker::Run functions.
The new thread will execute JS code just like the main one but it has been limited in the way it can interact with the environment (particularly the process itself).
This is in line with the JS approach to multithreading, where it is more a matter of "two or more message loops" than real multithreading (where threads are free to interact with each other, with all the implications at the architectural level).
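A minimal sketch of that "two message loops" interaction with Node's worker_threads (file names are illustrative):

    // main.js
    const { Worker } = require('worker_threads');

    const worker = new Worker('./worker.js');
    worker.on('message', (msg) => console.log('from worker:', msg));
    worker.postMessage({ n: 42 });

    // worker.js - a separate JS context, run on a native thread scheduled by the OS:
    //   const { parentPort } = require('worker_threads');
    //   parentPort.on('message', ({ n }) => parentPort.postMessage(n * 2));
    // The two sides share no state; they only exchange messages.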

Node.js and fragmentation

Background: I came from Microsoft world, in which I used to have websites stored on IIS. Experience taught me to recycle my application pool once a day in order to eliminate weird problems due to fragmentation. Recycling the app pool basically means to restart your application without restarting the entire IIS. I also watched a lecture that explained how Microsoft had reduced the fragmentation a lot in .Net 4.5.
Now, I'm deploying a Node.js application to a production environment and I have to make sure that it works flawlessly all the time. I originally planned to have my app restart once a day. Then I did some research to find some clues about fragmentation problems in Node.js. The only thing I've found is a snippet from an article describing GC in V8:
To ensure fast object allocation, short garbage collection pauses, and the "no memory fragmentation", V8 employs a stop-the-world, generational, accurate garbage collector.
This statement is really not enough for me to give up building a restart mechanism for my app, but on the other hand I don't want to do some work if there is no problem.
So my question is:
Should or shouldn't I restart my app every now and then in order to prevent fragmentation?
Implementing a server restart before you know that memory consumption is indeed a problem is a premature optimization. As such, I don't think you should do it until you actually find that it is a problem. You will likely find more important issues to optimize for as opposed to memory consumption.
To figure out if you need a server restart, I recommend doing the following:
Set up some monitoring tools like https://newrelic.com/ that let you monitor your performance.
Monitor your memory continuously. Try to see if there is steady increase in the amount of memory consumed, or if it levels off.
Decide upon an acceptable threshold before you need to act. For example, once your app consumes 60% of system memory, you need to start thinking about a server restart and decide upon the restart interval (a rough sketch of such a check follows these steps).
Decide if you are OK with having "downtime" while restarting the server or not. If you don't want downtime, you may need to build a proxy layer to direct traffic.
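A rough sketch of that kind of threshold check (the 60% figure, the interval, and the console warning are placeholders, not a recommendation):

    const os = require('os');

    // Every minute, compare this process's resident set size against total system
    // memory and flag when it crosses an (arbitrary) 60% threshold.
    const THRESHOLD = 0.6;

    setInterval(() => {
      const usedRatio = process.memoryUsage().rss / os.totalmem();
      if (usedRatio > THRESHOLD) {
        console.warn(`Memory at ${(usedRatio * 100).toFixed(1)}% - consider scheduling a restart`);
      }
    }, 60 * 1000);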
In general, I'd recommend server restarts for all dynamic, garbage collected languages. This is fairly common in those types of large applications. It is almost inevitable that a small mistake somewhere in your code base, or one of the libraries you depend on will leak memory. Even if you fix one leak, you'll get another one eventually. This may frustrate your team, which will basically lead to a server restart policy, and a definition of what is acceptable in regards to memory consumption for your application.
I agree with @Parris. You should probably figure out whether you actually need a restart policy first. I would suggest using pm2 (docs here). Even if you don't want to sign up for Keymetrics, it's a pretty good little process manager and real quick to set up. You can get a report of memory usage from the command line.
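Roughly, it looks something like this (app.js and my_app are placeholders for your own entry point and process name):

    # start one instance per core in cluster mode
    pm2 start app.js -i max --name my_app

    # tabular report that includes memory per process
    pm2 list

    # live CPU/memory view
    pm2 monit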
Also, if you start in cluster mode like above, you can call pm2 restart my_app and the first one will probably be up again before the last one is taken offline (this is an added benefit; the real reason for having 8 processes is to utilize all 8 cores). If you are adamant about avoiding downtime, you could restart them one by one according to id.
I agree with @Parris that this seems like a premature optimization. Also, restarting is not a solution to the underlying problem; it's a treatment for the symptoms.
If memory errors are a prevalent issue for your node application then I think that some thought as to why this fragmentation occurs in your program in the first place could be a valuable effort. Understanding why memory errors occur after a program has been running for a long period of time, and refactoring the architecture of your program to solve the root of the problem, is a better solution in my eyes than just addressing the symptoms.
I believe two things will benefit you.
Immutable objects will help a lot: they are much more predictable than mutable objects and will not be affected by how long the project has been live. Also, since immutable objects are read-only blocks of memory, they can be faster than mutable objects, where the engine has to spend resources deciding whether to read or write the memory block that stores the object. I currently use the library called Immutable.js and it works well for me. There are others as well, like Deep Freeze, though I have never used it.
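For example (a small sketch using Immutable.js; any similar library works the same way):

    const { Map } = require('immutable');

    const user = Map({ name: 'Ada', visits: 1 });

    // "Updates" return new objects; the original is never modified in place,
    // which is what keeps behavior predictable over long uptimes.
    const updated = user.set('visits', user.get('visits') + 1);

    console.log(user.get('visits'));    // 1  (unchanged)
    console.log(updated.get('visits')); // 2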
Make sure to manage your application's resources correctly; memory leaks are the second big contributor to this problem in my experience. Again, this is solved by thinking about how your application is structured and how user events are handled, making sure that once something is no longer being used by the client it is properly released so it can be removed from the heap. If it is not, the heap keeps growing until all memory is consumed, causing the application to crash. Node is a C++ program built around Google's V8 JavaScript engine.
You can use Node.js's process.memoryUsage() to monitor memory usage. For managing the heap, V8 offers two garbage collection strategies: Scavenge, which is very quick but incomplete, and Mark-Sweep, which is slower but frees up all non-referenced memory.
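One way to watch this in practice (a rough sketch; the --expose-gc flag is only needed if you want to force a full collection before measuring):

    // run with: node --expose-gc app.js
    function logHeap(label) {
      if (global.gc) global.gc();  // force a full collection so the reading isn't noise
      const { heapUsed, heapTotal } = process.memoryUsage();
      console.log(`${label}: ${(heapUsed / 1048576).toFixed(1)} MB used of ${(heapTotal / 1048576).toFixed(1)} MB`);
    }

    logHeap('baseline');
    // ... exercise the suspect code path ...
    logHeap('after workload');  // a steadily climbing "used" figure suggests a leak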
Refer to this blog post for more info on how to manage your heap and your memory in V8, which runs Node.js.
So the responsible approach to your implementation is to keep a close eye on open resources, maintain a deep understanding of the heap, and know how to free non-referenced memory blocks. Creating your project with this in mind also makes it a lot more scalable.

Why does web worker performance sharply decline after 30 seconds?

I'm trying to improve the performance of a script when executed in a web worker. It's designed to parse large text files in the browser without crashing. Everything works pretty well, but I notice a severe difference in performance for large files when using a web worker.
So I conducted a simple experiment. I ran the script on the same input twice. The first run executed the script in the main thread of the page (no web workers). Naturally, this causes the page to freeze and become unresponsive. For the second run, I executed the script in a web worker.
Script being executed
Test runner page
For small files in this experiment (< ~100 MB), the performance difference is negligible. However, on large files, parsing takes about 20x longer in the worker thread.
The main-thread run behaves as expected: parsing takes only about 11 seconds, and throughput is fairly steady.
The run inside the web worker is much more surprising.
Throughput is somewhat jagged for the first 30 seconds, which is normal (the jag is caused by the slight delay in sending the results to the main thread after every chunk of the file is parsed). However, parsing slows down rather abruptly at 30 seconds. (Note that I'm only ever using a single web worker for the job; never more than one worker thread at a time.)
I've confirmed that the delay is not in sending the results to the main thread with postMessage(). The slowdown is in the tight loop of the parser, which is entirely synchronous. For reasons I can't explain, that loop is drastically slowed down and it gets slower with time after 30 seconds.
But this only happens in a web worker. Running the same code in the main thread, as you've seen above, runs very smoothly and quickly.
Why is this happening? What can I do to improve performance? (I don't expect anyone to fully understand all 1,200+ lines of code in that file. If you do, that's awesome, but I get the feeling this is more related to web workers than my code, since it runs fine in the main thread.)
System: I'm running Chrome 35 on Mac OS 10.9.4 with 16 GB memory; quad-core 2.7 GHz Intel Core i7 with 256 KB L2 cache (per core) and L3 Cache of 6 MB. The file chunks are about 10 MB in size.
Update: Just tried it on Firefox 30 and it did not experience the same slowdown in a worker thread (but it was slower than Chrome when run in the main thread). However, trying the same experiment with an even larger file (about 1 GB) yielded significant slowdown after about 35-40 seconds (it seems).
Tyler Ault suggested one possibility on Google+ that turned out to be very helpful.
He speculated that using FileReaderSync in the worker thread (instead of the plain ol' async FileReader) was not providing an opportunity for garbage collection to happen.
Changing the worker thread to use FileReader asynchronously (which intuitively seems like a performance step backwards) accelerated the process back up to just 37 seconds, right where I would expect it to be.
I haven't heard back from Tyler yet and I'm not entirely sure I understand why garbage collection would be the culprit, but something about FileReaderSync was drastically slowing down the code.
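For reference, the change amounts to something like this inside the worker (names are illustrative; this is not the actual parser code):

    // Before: synchronous read, no chance for GC between chunks
    // const buffer = new FileReaderSync().readAsArrayBuffer(blob.slice(start, end));

    // After: asynchronous read; control returns to the worker's event loop
    // between chunks, which appears to give the GC room to run.
    function readChunk(blob, start, end) {
      return new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(reader.result);
        reader.onerror = () => reject(reader.error);
        reader.readAsArrayBuffer(blob.slice(start, end));
      });
    }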
What hardware are you running on? You may be running into cache thrashing problems with your CPU. For example, if the CPU cache is 1 MB per core (just an example) and you start trying to work with data that continually replaces the cache (cache misses), then you will suffer slowdowns - this is quite common with multithreaded systems, and common in IO transfers too. These systems also tend to have some OS overhead for the thread contexts, so if lots of threads are being spawned you may be spending more time managing the contexts than the threads spend 'doing work'. I haven't yet looked at your code, so I could be way off - but my guess is a memory issue, just due to what your application is doing. :)
Oh, and how to fix it: try making the blocks of execution small, single chunks that match the hardware. Minimize the number of threads in use at once - try to keep it to 2-3x the number of cores you have (this really depends on what sort of hardware you have). A rough sketch follows. Hope that helps.
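In browser terms, something like this (parser-worker.js and the chunking helper are made up; navigator.hardwareConcurrency reports the number of logical cores):

    // Keep the worker count tied to the hardware instead of spawning freely.
    const poolSize = Math.min(4, navigator.hardwareConcurrency || 2);
    const workers = Array.from({ length: poolSize }, () => new Worker('parser-worker.js'));

    // Hand each worker one fixed-size slice of the file at a time.
    const CHUNK_SIZE = 10 * 1024 * 1024; // ~10 MB, as in the question
    function sliceAt(file, index) {
      return file.slice(index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE);
    }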
