Why is LIBUV needed in Node JS?

So, maybe this question is too basic to ask, but I still have no clue why libuv has a place in the Node.js architecture. Here is my understanding of the Node.js architecture:
Node.js is built on top of V8.
V8 is capable of running code written to the ECMAScript standard.
V8 is written in C++.
So if you want to add new functionality, you can embed V8 in a C++ project and attach new code to the embedded V8.
Now here is my doubt:
Since V8 supports ECMAScript JavaScript, it has the capability to run callbacks written to the ECMAScript standard.
So we can add code for file system access, an HTTP server, and DB access in C++, since there are libraries (header files) that give that functionality; Java is written in C++ (correct me if I am wrong) and Java has the capability to do the same.
Now, if we can add this functionality in C++, where does libuv come into the picture of the Node.js architecture?
Thanks in advance and
Happy Coding :)

Check the docs below -
https://nodejs.org/en/docs/meta/topics/dependencies/#libuv
Another important dependency is libuv, a C library that is used to
abstract non-blocking I/O operations to a consistent interface across
all supported platforms. It provides mechanisms to handle file system,
DNS, network, child processes, pipes, signal handling, polling and
streaming. It also includes a thread pool for offloading work for some
things that can't be done asynchronously at the operating system
level.
So, to sum it up: V8 provides the functionality for running JavaScript, but libuv is used to access system resources such as the network and files. libuv also provides the threading model (a thread pool) for accessing those resources.
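One quick way to see that V8 and libuv are separate dependencies is to look at process.versions, which Node.js populates with the versions of its bundled components. This is only a small sketch; describeRuntime is a name made up for this example.

```javascript
// Node.js reports the versions of its bundled dependencies in process.versions.
// V8 (the JavaScript engine) and libuv (the I/O layer) appear as separate entries.
function describeRuntime() {
  const { node, v8, uv } = process.versions;
  return { node, v8, uv };
}

console.log(describeRuntime());
```

The exact version strings depend on the Node.js build you run it with.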

The libuv module has a responsibility that is relevant for some particular functions in the standard library. For some standard library function calls, the Node.js C++ side and libuv decide to do expensive calculations outside of the event loop entirely. They use something called a thread pool: a set of threads that can be used for running computationally intensive tasks such as hashing functions.
By default libuv creates four threads in this thread pool. That means that, in addition to the thread used for the event loop, there are four other threads that can be used to offload expensive calculations that need to occur inside our application. Many of the functions included in the Node.js standard library automatically make use of this thread pool.
The presence of this thread pool is very significant: clearly, Node.js is not truly single-threaded.
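The thread pool described above can be observed with crypto.pbkdf2, one of the standard library calls that Node.js runs on the libuv thread pool. The sketch below is illustrative only: hashMany, the password, the salt, and the iteration count are all made up for this example, and the timings it prints will vary from machine to machine.

```javascript
const crypto = require('crypto');

// Start `count` key derivations at once and report when each one finishes.
// Each crypto.pbkdf2 call is handed to the libuv thread pool.
function hashMany(count, iterations = 20000) {
  const start = Date.now();
  const jobs = [];
  for (let i = 0; i < count; i++) {
    jobs.push(new Promise((resolve, reject) => {
      crypto.pbkdf2('password', 'salt', iterations, 64, 'sha512', (err, key) => {
        if (err) return reject(err);
        resolve({ job: i, ms: Date.now() - start, bytes: key.length });
      });
    }));
  }
  return Promise.all(jobs);
}

hashMany(5).then((results) => {
  // With the default pool of four threads, the fifth job typically has to
  // wait for a free thread, so it tends to finish later than the first four.
  for (const r of results) console.log(`job ${r.job} done after ${r.ms} ms`);
});
```

Raising the iteration count makes the effect easier to see. The pool size can be changed with the UV_THREADPOOL_SIZE environment variable, set before the process starts.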
libuv also gives Node.js access to facilities of the underlying operating system, such as the file system and networking. So just as the Node.js standard library has some functions that make use of the libuv thread pool, it also has some functions that make use of code built into the underlying operating system, through libuv.
A simple HTTPS request:
const https = require('https');
const start = Date.now();
https.request('https://www.google.com', (res) => {
  res.on('data', () => {}); // drain the response body
  res.on('end', () => { console.log(Date.now() - start); }); // elapsed ms
}).end();
So in this case libuv sees that we are attempting to make a network request. Neither libuv nor Node.js contains the code for the low-level operations involved in moving data over the network. Instead, libuv delegates that work to the underlying operating system. So it is actually our operating system that does the real network I/O: libuv is used to issue the request, and then it just waits for the operating system to signal that some response has come back. Because libuv delegates the work to the operating system, the operating system itself decides whether to create a new thread or not, and more generally how to handle the entire process of making the request.

If anyone stumbles upon this: since it lacks a good answer to the OP's question, I will try to take this on.
TL;DR:
The JavaScript language is not asynchronous.
The JavaScript language is not multi-threaded.
Callbacks themselves are not asynchronous; they are just a means to piggyback your code onto an asynchronous operation.
Let's go over your doubts one by one.
1. Since V8 supports EcmaScript Javascript that means it has the capability to run callbacks written with the standards of EcmaScript.
Callbacks don't mean that the operation is asynchronous. A callback has nothing to do with asynchronous execution. A callback is just a way to piggyback your function so that it executes after something else, for example after 'something asynchronous'.
// example of a synchronous callback
function main(cb) {
  console.log('main code of the function');
  cb(); // callback invocation here
}
main(function () {
  console.log('in callback');
});
Now an example of an asynchronous callback:
// ajaxCall is a hypothetical function that returns a promise
function getDataFromNetwork(url, cb) {
  ajaxCall(url).then(cb);
}
getDataFromNetwork('http://some-endpoint', function (data) {
  console.log(data);
});
This is an asynchronous call with a callback. Here the getDataFromNetwork function is asynchronous, not the callback. The point is that callbacks are just a mechanism for running some code after something else. In an asynchronous operation this becomes a necessity. How else are we going to do that, right?
No!
Nowadays we have async/await, where you can run code after the asynchronous function completes without using callbacks.
So you get that? Callbacks are not asynchronous. And that's not the point of having libuv.
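To make the async/await point concrete, here is a minimal sketch of the same getDataFromNetwork idea without a callback parameter. The network call is faked with a timer, so the URL, the payload, and the 10 ms delay are all assumptions made for illustration.

```javascript
const { promisify } = require('util');

// promisify(setTimeout) gives a function that resolves after `ms` milliseconds.
const sleep = promisify(setTimeout);

// Hypothetical stand-in for a network call: resolves with fake data after 10 ms.
async function getDataFromNetwork(url) {
  await sleep(10);
  return { url, body: 'example payload' };
}

async function main() {
  const log = [];
  log.push('before await');
  // Execution of main() pauses here; no callback is passed in.
  const data = await getDataFromNetwork('http://some-endpoint');
  log.push(`received ${data.body}`);
  return log;
}

main().then((log) => console.log(log));
```

The asynchronous part is still the timer (or, in real code, the I/O), not the syntax used to continue after it.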
2. So we can add code for File System Access, Http server & DB access in C++ since there are libraries (header files) that gives that functionality since Java is written in C++ (correct me if I am wrong) and Java has the capability to do the same.
Yes, we can add lots of code for file system access and an HTTP server. But why? We already have a lot of libraries to do that. And yes, they are already written in C; that's how Node.js executes them.
Java already has that?
Right, but that is also part of the JVM rather than the core Java language, just like libuv is part of the Node.js runtime rather than the core JavaScript language. In this regard Java and Node.js are similar. It's just that Java has its own C++ layer and Node.js borrows libuv for that. BTW, libuv was primarily built for Node.js.
3. Now if we can add these functionality in C++ where does the place for Libuv come in to the picture of NodeJs architecture.
I answered how this functionality is already available in C/C++; now let's see where libuv fits in the picture of the whole architecture.
Let's take an ajax/network call for example. Who do you think executes this?
Node.js? No, it just gives instructions to its C++ API (Node API).
Then is it the Node API? No, it just gives instructions to libuv.
Then is it libuv? Yes, it is.
The same goes for timers, file access, child processes, etc.
Also think about what happens when a lot of network calls and file accesses are fired within a Node.js program: on what thread do they run? Who schedules them? Who notifies the program about results and failures?
This is a lot to do. Java has its own thread pool to do that. Java has its own scheduler to schedule the threads. And since Java provides threads to the end user (programmers) as well, it makes sense to implement all that stuff using Java threads.
But JavaScript in Node.js runs on a single thread. Why should it have its own threads to execute I/O operations when it can borrow them from another library without making them a part of JavaScript? After all, we aren't going to provide threads to the programmer, so why bother?
Also, historically, JavaScript was only meant to run in browsers. The only asynchronous operations browsers had access to were network requests: no file access, no DB. So we did not have a lot of bedrock already to build upon.

Related

Is PURE Javascript synchronous or asynchronous?

From my understanding, JavaScript will run either in the browser or as a backend in Node.js.
The browser or Node.js, depending on where you run your JavaScript, will handle operations that would block the runtime (e.g. network calls, image rendering) via web APIs or C++ APIs, then send their results to the event loop, and eventually merge them back into the single thread that JavaScript runs on.
What I don't understand is, when I google "is javascript synchronous or asynchronous", the answer is javascript is asynchronous.
But is that true? JavaScript is asynchronous because of the web APIs or C++ APIs in the browser or Node.js backend, which create threads under the hood, but JavaScript itself isn't asynchronous then?
If JavaScript only has one thread, it must be a synchronous language?

Javascript (as implemented in the browser and in node.js) is an event driven system.
That means that it works best when used with non-blocking, asynchronous I/O that gives you the best experience and features in coordination with the event driven system. This isn't necessarily inherent in Javascript the language itself (you could make a version of Javascript that had nothing but blocking I/O), but all the popular implementations of Javascript depend upon an event queue and depend upon asynchronous I/O working in coordination with the event queue to offer a useful programming environment.
Until recently, Javascript also didn't have useful threads and useful thread synchronization tools to make a multi-threaded system with blocking I/O practical or useful. There are now threads in both the browser and node.js, though the threads in node.js are pretty heavy-handed (a whole new instance of the V8 interpreter, separate heap, etc...) so they would not necessarily be performance competitive with systems that have threads built in as more of an inherent feature. Plus the thread synchronization tools in Javascript are fairly early in their development.
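The "separate heap" point can be illustrated with the worker_threads module. This is a small sketch, not a recommended pattern: the worker code is passed as a string with eval: true only to keep the example in one file, and runWorker and the counter global are names invented here.

```javascript
const { Worker } = require('worker_threads');

// Source for the worker, passed as a string so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // This global lives in the worker's own heap, not the main thread's.
  globalThis.counter = (globalThis.counter || 0) + workerData.increment;
  parentPort.postMessage({ counter: globalThis.counter });
`;

function runWorker(increment) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { increment } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

globalThis.counter = 100; // set on the main thread only

runWorker(5).then((msg) => {
  console.log('worker saw counter =', msg.counter);            // 5, not 105
  console.log('main still has counter =', globalThis.counter); // 100
});
```

The worker never sees the main thread's global, which is why data has to be passed explicitly through workerData or messages.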
What I don't understand is, when I google "is javascript synchronous or asynchronous", the answer is javascript is asynchronous.
Current popular implementations of Javascript are in environments that require asynchronous I/O in order to be productive. It's not necessarily required in the pure language all by itself, though I don't know of any implementations that assume threads and blocking I/O.
Javascript is asynchronous because of the web API's or c++ API's in the browser or Node.js backend, that makes threads under the hood, but javascript itself isn't asynchronous then?
A Javascript environment has asynchronous capabilities because Javascript is paired with an event driven environment and with asynchronous operations such as timers and I/O. So, the combination of the Javascript implementation and the other things the environment adds to it makes an environment in which you can write code that uses asynchronous features. Please don't get too hung up on the semantic argument about whether Javascript is or isn't asynchronous itself. As best I know, the ECMAScript specification that specifies the Javascript language doesn't necessarily require that. I think there could exist an implementation of the pure Javascript language with no asynchronous capabilities. But most of what you read on the web or in books will refer to "Javascript" when what they really mean are the popular implementations of Javascript, such as in a web browser or in node.js. And, frankly, that's mostly what is relevant, since that's where you can actually use Javascript unless you're going to build your own custom environment.
If javascript only has one thread it must be a synchronous language?
It's not entirely clear what you mean by this question. By default (without invoking Web Workers or Worker Threads) your Javascript code runs in one single thread, but it has access to non-blocking I/O functions that allow operations to run in parallel with your Javascript. In a browser, you can make an Ajax call to your server, then go do something else while that Ajax call is finishing (make some calculations, update the screen, update a clock on screen, etc.), and then, when a completion notification arrives from the Ajax call, you can process the results. While your actual lines of Javascript were run one after another synchronously, you were allowed to start asynchronous operations and thus run some things in parallel with your Javascript execution. I will avoid debating whether one wants to call it a "synchronous language" or not. That's just a semantic argument. It works the way it works, running your Javascript in a single thread, but taking advantage of native OS capabilities to run other things in parallel with the Javascript (like network operations).
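The "start something, keep computing, handle the result later" pattern described above can be sketched in Node.js with a file system call in place of the Ajax request. The function name statWhileComputing and the choice of fs.stat are assumptions made for this example.

```javascript
const fs = require('fs');

function statWhileComputing(path) {
  return new Promise((resolve, reject) => {
    const order = [];
    let sum = 0;

    // Start the asynchronous operation; the call returns right away.
    fs.stat(path, (err, stats) => {
      if (err) return reject(err);
      order.push('stat finished');
      resolve({ order, sum, isFile: stats.isFile() });
    });
    order.push('stat started');

    // Meanwhile the single JavaScript thread is free to do other work.
    for (let i = 1; i <= 1000; i++) sum += i;
    order.push('computed sum');
  });
}

// process.execPath is the path of the running node binary, so it always exists.
statWhileComputing(process.execPath).then((result) => console.log(result.order));
```

The completion callback only runs after the current synchronous code has finished, even if the operating system answered earlier.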

is it possible to achieve multithreading in nodejs? [duplicate]

This question already has answers here:
How to create threads in nodejs
(12 answers)
Closed 6 years ago.
Node.js multithreading.
Is it possible to use multithreading in Node.js? If yes:
What are the advantages and disadvantages of using multithreading in Node.js? Which modules can be used to achieve multithreading in Node.js? I am a newbie to Node.js, and I read on many blogs that Node.js is single threaded.
I know the java multithreading but I need to know whether it is possible in Node.js or not.

Yes and no. Let's start from the beginning. Why Node.js is single-threaded is explained here: Why is Node.js single threaded?
While Node.js itself is multithreaded -- I/O and other such operations run from a thread pool -- JavaScript code executed by Node.js runs, for all practical purposes, in a single thread. This isn't a limitation of Node.js itself, but of the V8 JavaScript engine and of JavaScript implementations generally.
Node.js includes a native mechanism for clustering multiple Node.js processes, where each process runs on a separate core. But that clustering mechanism doesn't include any native routing logic or shared state between workers.
Generally, and more clearly, the statement is that each Node.js process is single-threaded. If you want multiple threads, you have to have multiple processes as well.
For instance, you can use child processes for this, as described here: http://nodejs.org/api/child_process.html . Also check out this article, which is very instructive and well written and may help you if you want to work with child processes -- https://blog.scottfrees.com/automating-a-c-program-from-a-node-js-web-app
Despite all of the above, you can achieve a kind of multi-threading with C++ and native Node.js C++ addon development.
First of all, check out these answers; they will probably help you:
How to create threads in nodejs
Node.js C++ addon: Multiple callbacks from different thread
Node.js C++ Addon: Threading
https://bravenewmethod.com/2011/03/30/callbacks-from-threaded-node-js-c-extension/
Of course, you can find and leverage a lot of Node.js packages that provide "multi"-threading capability: https://www.npmjs.com/search?q=thread
In addition, you can check JXCore https://github.com/jxcore/jxcore
JXCore is a fork of Node.js that allows Node.js apps to run on multiple threads housed within the same process. So JXCore may well be a solution for you.
"What are the advantages and disadvantages of using multi-threading in Node.js ?"
It depends on what you want to do. There are no disadvantages if you use the Node.js sources correctly and your "multi"-threaded plugins or processes do not "hack" or misuse anything from the core of V8 or Node.js!
As in every answer, the correct answer is "use the right tools for the job".
Of course, since Node.js is single-threaded by design, there may be better approaches for multithreading.
A technique that a lot of people use is to write the multi-threaded application in C++, Java, Python, etc., and then run it from Node.js via child_process. The third-party application runs asynchronously, you get better performance (e.g. from a C++ app), and you can send input to it and get output from it in your Node.js application.
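A minimal sketch of that technique, assuming the external program reads from stdin and writes to stdout. To keep the example self-contained, the "external program" here is just a second Node.js process that upper-cases its input; in practice it could be any executable. runExternal and childSource are names invented for this example.

```javascript
const { spawn } = require('child_process');

// Stand-in for an external program: reads all of stdin, writes it upper-cased.
const childSource = `
  let input = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { input += chunk; });
  process.stdin.on('end', () => { process.stdout.write(input.toUpperCase()); });
`;

function runExternal(input) {
  return new Promise((resolve, reject) => {
    // process.execPath is the running node binary; '-e' evaluates childSource.
    const child = spawn(process.execPath, ['-e', childSource]);
    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code !== 0) return reject(new Error(`child exited with code ${code}`));
      resolve(output);
    });
    child.stdin.end(input); // send the input, then signal end-of-input
  });
}

runExternal('hello from the parent').then((out) => console.log(out));
```

The parent's event loop stays free while the child runs, since the child is a separate operating system process.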
Disadvantages of multi-threading in Node.js
Check this: https://softwareengineering.stackexchange.com/questions/315454/what-are-the-drawbacks-of-making-a-multi-threaded-javascript-runtime-implementat
Keep in mind that creating a purely multithreaded environment in Node.js by modifying it would, I suppose, be difficult and risky because of the complexity. Moreover, you would always have to keep up to date with each new V8 or Node.js release, which would probably affect your changes.

No, you can't use threads in Node.js. It uses an asynchronous model of code execution. Behind the asynchronous model, Node.js itself uses threads, but as far as I know they can't be accessed in the app without additional libraries.
With the asynchronous model you don't actually need threads. Here is a simple example. Normally, in multi-threaded environments, you would run network requests in separate threads so as not to block the execution of code in the main thread. With the async model, those requests do not block the main thread and are still executed in other threads; this is just hidden from you to make the development process straightforward.
Also check this comment by bazza.

Why is org/arangodb/request synchronous?

Why is the new JavaScript module request synchronous? Is it supposed to be only used in a job queue?
Is there any way to make asynchronous http(s) requests in ArangoDB?

Full disclosure: I'm part of ArangoDB's development team and primarily work on Foxx and everything JavaScript. I'm also the guy who wrote the org/arangodb/request module.
ArangoDB is a different environment than Node.js, despite sharing many similarities (such as using the V8 JavaScript engine). Unlike Node.js (or the browser), ArangoDB uses a thread-based concurrency model and doesn't feature an Event Loop. However the threads are not exposed in JavaScript (and in fact in V8 every thread is fully isolated) so you normally don't even have to think of them.
In the browser and in Node.js functions like setTimeout work by delaying code execution via the Event Loop (until a certain amount of time has passed or until an external event has occurred).
In ArangoDB the code is always executed linearly. For example, incoming HTTP requests are passed to Foxx controllers in JavaScript and the response is sent as soon as the controller returns. Even if you could use setTimeout, the external resources you were working with (or even "internal" ones like the document collections and transactions) would likely be already gone by the time the delayed code could execute.
Because of this, the request function provided by the org/arangodb/request module is also entirely synchronous. Instead of returning a promise or taking a callback it directly returns the incoming response data. It is also decidedly not the same module as request on npm but rather a synchronous implementation based on that module's API to the extent that implementing its API is possible outside Node.js (e.g. not including streams and returning the remote response instead of taking callbacks).
If you come from a Node.js/io.js background, this may feel wrong because non-blocking IO can achieve higher throughput, but keep in mind that the design goals of ArangoDB and Node.js are very different. Node.js is built around streams and network connections. ArangoDB is built as a persistent data storage and has to deal with transactions and locks instead.
It is probably not the best idea to access external APIs directly from your Foxx controllers if you have a high likelihood of serious network latency or if the external API's response is not essential to the client response. This is what the Foxx queues are for. Transactional e-mails are a prime example for this.
While Foxx is very versatile, its primary focus is to allow you to move most of your application (especially logic that benefits from running closer to the data) directly into the database. For small to medium scale projects, you can probably get away with doing external API calls in-bounds. But if your application is primarily concerned with talking to other services over the network, running that code in a database is probably not the optimal solution.
Luckily ArangoDB plays well with others, so it's easy to move your network-intensive code out of Foxx if you find that it becomes a performance bottleneck at higher loads. Foxx doesn't eliminate the need for application servers, but it can considerably reduce their complexity.
As a correction to Brian's answer: sadly promises won't let you write async code in a synchronous environment either. The Promises/A+ spec defines promises as having to be executed asynchronously. Where they aren't natively supported they still have to be built on top of existing functions like setTimeout or process.nextTick, neither of which ArangoDB implements.
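The requirement that promise callbacks run asynchronously is easy to observe in Node.js or a browser (this sketch does not apply to ArangoDB's environment; promiseOrder is a name made up for the example). Even a promise that is already resolved does not call its then callback until the current synchronous code has finished:

```javascript
function promiseOrder() {
  const order = [];
  // The promise is already resolved, yet the callback is still deferred.
  const done = Promise.resolve('value').then(() => {
    order.push('then callback');
    return order;
  });
  order.push('synchronous code');
  return done;
}

promiseOrder().then((order) => console.log(order));
// → [ 'synchronous code', 'then callback' ]
```

That deferral needs some kind of job queue or event loop underneath, which is the piece a purely synchronous environment lacks.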

In a native node module, how can I make sure that my async code is always running on the same thread?

I'm writing a native node module in C++ which will be a binding for a C library.
Some of the objects in this library must only be used by a single thread. Which means that if I use uv_queue_work I can't make sure they are only used by the same thread, since - as far as I know - libuv uses a thread pool and I haven't been able to find out how to tell it what thread to use for this kind of work.
Here are some ideas for the situation, but I'm not sure which is the correct approach.
Simply make all the methods synchronous - this would unfortunately defeat the purpose and concepts of Node.js, so I'd prefer not to
Create a custom thread and execute my code on that - this would defeat the purpose of libuv's thread pool and require more work
Tell libuv somehow to execute operations of the same object on the same thread in its thread pool - I haven't found a way in the documentation to do this
What is the recommended course of action for this kind of Node.js module?

While I'll start by saying it's unfortunate that the architecture doesn't support the generic callback model, I'll accept it is a special case that cannot be avoided.
You still have full access to the libuv API in a native module, so it's entirely possible to create your own thread and use that single thread to schedule all the applicable asynchronous work. For a quick primer check out http://nikhilm.github.io/uvbook/threads.html
After the operation is complete you can pass the desired JS callback to MakeCallback. This should allow any JS API interactions to appear normal.

What exactly is a node in node.js?

In Erlang I was able to immediately understand the notion of a 'node' - a self-contained Erlang VM. I could start a node on one machine with erl -name gandalf -setcookie abc, and another node on another machine (on the same LAN) with erl -name bilbo -setcookie abc. I could then spawn processes on gandalf which would communicate magically with other processes on bilbo. Now, since I also wanted to serve up a jazzy webpage with animated graphical results from my Erlang processes, I picked up some Javascript and learnt jQuery. Still a humble padawan, but I sort of understand how Javascript fits into the scheme of things.
I recently came across node.js and an (evil) voice started whispering: 'This is it! Now you can do everything with Javascript! Forget Erlang and guards and periods, stick to a language that everyone uses'.
I've read the docs a bit, but I still don't understand what a node is in node.js. Do I have to run an HTTP server, and that becomes my node? What if I don't like HTTP, or I don't care how gandalf talks to bilbo - that's what I like in Erlang. Maybe I naively expect that node.js is Erlang with Javascript sugar?

Node.js has much more in common with Twisted than with Erlang/OTP. Node.js is just a single-threaded SEDA event loop. Node.js has nothing comparable to the Erlang VM when it comes to distribution, hot code reloading, and scalability via processes; it isn't anything close to "Erlang with Javascript sugar".

Maybe because of your Erlang knowledge you thought that somehow Node.js had something to do with "nodes" (as erlang nodes), but it's just the name.
The main idea with Node.js is that you defer all expensive I/O operations and assign callbacks to the result of those operations. The reason is that I/O blocks the (only) process that is running at the moment. Node.js will handle this for you, given that you are coding in the proper way.
An easy example of this is a database call:
result = SQL.query("EXPENSIVE SELECT HERE")
doSomething(result);
moreStuff(); // This line must wait until the previous ones are completed.
In Node.js you would code this in a very different way:
SQL.query("EXPENSIVE SELECT HERE", function(result) {
  doSomething(result);
});
moreStuff(); // This line executes immediately
If you have badly behaved code in your Node.js script, like:
while(true) { }
then you are blocking the process and it won't be able to handle any requests other than the current one, so in Node.js it is mandatory to follow the above guidelines.
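The effect of blocking can be measured with a bounded version of that loop. In this sketch (blockFor and measureTimerDelay are names invented here) a timer is scheduled for a short delay, but its callback cannot run until the synchronous busy-wait has returned:

```javascript
// Busy-wait for `ms` milliseconds, standing in for any long synchronous work.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

function measureTimerDelay(timerMs, blockMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), timerMs);
    blockFor(blockMs); // the timer callback cannot run until this returns
  });
}

measureTimerDelay(10, 200).then((elapsed) => {
  console.log(`timer asked for 10 ms, fired after ${elapsed} ms`);
});
```

The timer asked for 10 ms but fires only after roughly 200 ms; an incoming request handler would be delayed in the same way.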

As I understand it, a Node.JS node is an instance of the V8 engine with the Node.JS runtime and event-loop running in it. While the Node.JS runtime gives you the ability to very quickly and simply begin processing HTTP requests, it's not mandatory; it is very good at handling most any kind of asynchronous I/O, really.
I don't know that much about Erlang, but my superficial understanding is that its great strength is high-concurrency computing. Node.JS doesn't specialize in that, per se. Its heart is "evented I/O", dealing neatly and cleanly with asynchronous I/O.

There is no "node" in Node.js.
As mentioned, when you run
node my_script.js
you are running one instance of the V8 JavaScript engine
(which uses one core for its lifetime).
