Why is the new JavaScript module request synchronous? Is it supposed to be only used in a job queue?
Is there any way to make asynchronous http(s) requests in ArangoDB?
Full disclosure: I'm part of ArangoDB's development team and primarily work on Foxx and everything JavaScript. I'm also the guy who wrote the org/arangodb/request module.
ArangoDB is a different environment than Node.js, despite sharing many similarities (such as using the V8 JavaScript engine). Unlike Node.js (or the browser), ArangoDB uses a thread-based concurrency model and doesn't feature an Event Loop. However the threads are not exposed in JavaScript (and in fact in V8 every thread is fully isolated) so you normally don't even have to think of them.
In the browser and in Node.js functions like setTimeout work by delaying code execution via the Event Loop (until a certain amount of time has passed or until an external event has occurred).
In ArangoDB the code is always executed linearly. For example, incoming HTTP requests are passed to Foxx controllers in JavaScript and the response is sent as soon as the controller returns. Even if you could use setTimeout, the external resources you were working with (or even "internal" ones like the document collections and transactions) would likely be already gone by the time the delayed code could execute.
Because of this, the request function provided by the org/arangodb/request module is also entirely synchronous. Instead of returning a promise or taking a callback it directly returns the incoming response data. It is also decidedly not the same module as request on npm but rather a synchronous implementation based on that module's API to the extent that implementing its API is possible outside Node.js (e.g. not including streams and returning the remote response instead of taking callbacks).
If you come from a Node.js/io.js background, this may feel wrong because non-blocking IO can achieve higher throughput, but keep in mind that the design goals of ArangoDB and Node.js are very different. Node.js is built around streams and network connections. ArangoDB is built as a persistent data storage and has to deal with transactions and locks instead.
It is probably not the best idea to access external APIs directly from your Foxx controllers if you have a high likelihood of serious network latency or if the external API's response is not essential to the client response. This is what the Foxx queues are for. Transactional e-mails are a prime example for this.
While Foxx is very versatile, its primary focus is to allow you to move most of your application (especially logic that benefits from running closer to the data) directly into the database. For small to medium scale projects that, you can probably get away with doing external API calls in-bounds. But if your application is primarily concerned with talking to other services over the network, running that code in a database is probably not the optimal solution.
Luckily ArangoDB plays well with others, so it's easy to move your network-intensive code out of Foxx if you find that it becomes a performance bottleneck at higher loads. Foxx doesn't eliminate the need for application servers, but it can considerably reduce their complexity.
As a correction to Brian's answer: sadly promises won't let you write async code in a synchronous environment either. The Promises/A+ spec defines promises as having to be executed asynchronously. Where they aren't natively supported they still have to be built on top of existing functions like setTimeout or process.nextTick, neither of which ArangoDB implements.
Related
From my understanding javascript will either run in the browser or as a backend in Node.js.
The browser or Node.js, depending on where you run your javascript, will via web API's or c++ API's handle functions that block the runtime (i.e network calls, image rendering, etc), then send them to the event loop and eventually merge them into the single thread that javascript runs on.
What I don't understand is, when I google "is javascript synchronous or asynchronous", the answer is javascript is asynchronous.
But is that true? Javascript is asynchronous because of the web API's or c++ API's in the browser or Node.js backend, that makes threads under the hood, but javascript itself isn't asynchronous then?
If javascript only has one thread it must be a synchronous language?
Javascript (as implemented in the browser and in node.js) is an event driven system.
That means that it works best when used with non-blocking, asynchronous I/O that gives you the best experience and features in coordination with the event driven system. This isn't necessarily inherent in Javascript the language itself (you could make a version of Javascript that had nothing but blocking I/O), but all the popular implementations of Javascript depend upon an event queue and depend upon asynchronous I/O working in coordination with the event queue to offer a useful programming environment.
Until recently, Javascript also didn't have useful threads and useful thread synchronization tools to make a multi-threaded system with blocking I/O practical or useful. There are now threads in both the browser and node.js, though the threads in node.js are pretty heavy-handed (a whole new instance of the V8 interpreter, separate heap, etc...) so they would not necessarily be performance competitive with systems that have threads built in as more of an inherent feature. Plus the thread synchronization tools in Javascript are fairly early in their development.
What I don't understand is, when I google "is javascript synchronous or asynchronous", the answer is javascript is asynchronous.
Current popular implementations of Javascript are in environments that require asynchronous I/O in order to be productive. It's not necessarily required in the pure language all by itself, though I don't know of any implementations that assume threads and blocking I/O.
Javascript is asynchronous because of the web API's or c++ API in the browser are Node.js backend that makes threads under the hood, but javascript itself isn't asynchronous then?
A Javascript environment has asynchronous capabilities because Javascript is paired with an event driven environment and is paried with asynchronous operations such as timers and I/O. So, the combination of the Javascript implementation and the other things the environment adds to it make an environment capable of writing code that can use asynchronous features. Please don't too hung up on the semantic argument about whether Javascript is or isn't asynchronous itself. As best I know, the ECMAScript specification that specifies the Javascript language doesn't necessarily require that. I think there could exist an implementation of the pure Javascript language with no asynchronous capabilities. But, most of what you read on the web or in books will refer to "Javascript" when what they really mean are the popular implementations of Javascript such as in a web browser or in node.js. And, frankly, that's mostly what is relevant since that's where you can actually use Javascript unless you're going to build your own custom environment.
If javascript only has one thread it must be a synchronous language?
It's not entirely clear what you mean by this question. By default (without invoking webWorkers or Worker Threads) Javascript runs your Javascript code in one single thread, but it has access to non-blocking I/O functions that allow operations to run in parallel with your Javascript. In a browser, you can make an Ajax call to your server, then go do something else while that Ajax call is finishing (make some calculations, update the screen, update a clock on screen, etc...) and then when a completion notification arrives from the Ajax call, you can process the results. While your actual lines of Javascript were run one after another synchronously, you were allowed to start asynchronous operations and thus run some things in parallel with your Javascript execution. I will avoid debating whether one wants to call it a "synchronous language" or not. That's just a semantic argument. It works the way it works, running your Javascript in a single thread, but taking advantage of native OS capabilities to run other things in parallel with the Javascript (like network operations).
So i'm trying to create a system which users can match each other by specific information,
the flow i have in mind is as follows:
user 1 fills the information and clicks "find"
at the same time user 2 does the same as user 1
the client sends a request to the server in route /X so the server can push the client to a (threadsafe)queue
a worker thread pulls out from the queue each time and do the matching
meanwhile the user polls route /Y in the server to get his match
the worker thread finds 2 users match and pushes it to some (threadsafe)data structure
next time the user polls the server(in /Y), the user gets the match and is redirected to the conversation
so first of all is this a good approach?
and also is using a worker thread and threadsafe datastructure logical in javascript?(specifically Nodejs and express) is there an alternative or a better way to do this kind of stuff?
thanks.
This is a bad approach.
You do not need (and should not use) worker threads for your use case.
On Worker Threads
Worker Threads are isolated instances of Javascript which run as a separate thread. They are intended strictly for performing CPU-intensive work.
vs vanilla Node
But you don't need them, because Node libraries are asynchronous, which means that unless your code really is CPU-intensive, you won't see any benefit from using Worker Threads (in fact there is overhead to using them, so if they aren't needed, your code will run slower).
From the docs: "Workers (threads) are useful for performing CPU-intensive JavaScript operations. They will not help much with I/O-intensive work. Node.js’s built-in asynchronous I/O operations are more efficient than Workers can be."
More on Threadedness
Javascript is single-threaded, and works very well that way. There is no concept of "threadsafe" in Javascript, because it isn't needed; all code is threadsafe.
If you do have CPU-intensive code
If you're doing expensive regex matching, then you are right to want to run this code in parallel. Worker Threads might not be the best way to do this, though.
Splitting CPU-intensive code into separate programs is often the most flexible solution. It gives you several options:
spawn a new instance of Node run your CPU-intensive code (on the same server)
run your CPU-intensive code in the cloud on "serverless" services, such AWS Lambda
turn your CPU-intensive code into a "microservice", essentially a tiny webserver which does any specialized processing and returns the result
Further Reading
How Node's asynchronicity works (big picture)
https://blog.insiderattack.net/event-loop-and-the-big-picture-nodejs-event-loop-part-1-1cb67a182810
What kinds of operations block the event loop and how to avoid it
https://nodejs.org/uk/docs/guides/dont-block-the-event-loop/
So, maybe this question is too noob and novice to be asked but I still have no clue why LIBUV got a place in Node JS Architecture? So here is my understanding of NodeJs architecture.
Node Js is built over V8
V8 is capable of running code written with EcmaScript standards.
V8 is written in C++.
So if you want to give any new functionality we can embed V8 in our C++ project and attach new code with new Embedded V8 in C++.
Now here is the doubt,
Since V8 supports EcmaScript Javascript that means it has the capability to run callbacks written with the standards of EcmaScript.
So we can add code for File System Access, HTTP server & DB access in C++ since there are libraries (header files) that gives that functionality since Java is written in C++ (correct me if I am wrong) and Java has the capability to do the same.
Now if we can add this functionality in C++ where does the place for Libuv come into the picture of NodeJs architecture.
Thanks in advance and
Happy Coding :)
Check the docs below -
https://nodejs.org/en/docs/meta/topics/dependencies/#libuv
Another important dependency is libuv, a C library that is used to
abstract non-blocking I/O operations to a consistent interface across
all supported platforms. It provides mechanisms to handle file system,
DNS, network, child processes, pipes, signal handling, polling and
streaming. It also includes a thread pool for offloading work for some
things that can't be done asynchronously at the operating system
level.
So to sum it up, V8 provides the functionalities related to running JS files, but to use system resources like Network, Files, etc., libuv is used. Also it provides a threading model for accessing the resources mentioned.
The libuv module has a responsibility that is relevant for some particular functions in the standard library. for SOME standard library function calls, the node C++ side and libuv decide to do expensive calculations outside of the event loop entirely.They make something called a thread pool that thread pool is a series of four threads that can be used for running computationally intensive tasks such as hashing functions.
By default libuv creates four threads in this thread pool. So that means that in addition to that thread used for the event loop there are four other threads that can be used to offload expensive calculations that need to occur inside of our application. Many of the functions include in the node standard library will automatically make use of this thread pool.
Now the presence of this thread pool is very significant. Well clearly Node.js is not truly single threaded
Libuv also gives node access to the operating system’s underlying file system such as networking. So just as the node standard library has some functions that make use of libuv thread pool it also has some functions that make use of code that is built into the underlying operating system through libuv.
Simple Http request
const https=require(“https”)
const start=Date.now()
https.request(“https://www.google.com”,res=>{
res.on(“data”,()=>{} )
res.on(“end”,()=>{console.log(Date.now()-start) }) }).end()
So in this case libuv sees that we are attempting to make an HTTP request. Neither libuv nor node has any code to handle all of this low level operations that are involved with a network request. Instead libuv delegates the request making to the underlying operating system. So it's actually our operating system that does the real HTTP request Libuv is used to issue the request and then it just waits on the operating system to emit a signal that some response has come back to the request. So because Libuv is delegating the work done to the operating system the operating system itself decides whether to make a new threat or not. Or just generally how to handle the entire process of making the request.
If anyone stumbles upon this and since it lacks a good answer to the OP's question, I will try to take on this.
TLDR;
Javascript language is not asynchronous
Javascript language is not multi-threaded
Callbacks themselves are not asynchronous, they are just mean to piggyback your code to an asynchronous operation.
Let's go over your doubts one by one.
1. Since V8 supports EcmaScript Javascript that means it has the capability to run callbacks written with the standards of EcmaScript.
Callbacks don't mean that the operation is asynchronous. A callback has got nothing to do with asynchronous execution. Callback is just a way to piggyback your function so that it executes after 'something asynchronous'.
// example of synchronous callback
function main(cb) {
console.log('main code of the function');
cb(); // callback invocation here
}
main(function () {
console.log('in callback');
});
Now an example of asynchronous callback
function getDataFromNetwork(url, cb) {
ajaxCall(url).then(cb);
}
getDataFromNetwork('http://some-endpoint', function (data) {
console.log(data);
});
This is an asynchronous call with a callback. Here getDataFromNetwork function is asynchronous not the callback. The point is that callbacks are just a mechanism of running a code after something. In an asynchronous operation, this becomes a necessity. How else we are going to do that? right?
No!
Nowadays we have async-await where you can run a code after the asynchronous function completes without using callbacks.
So you get that? Callbacks are not asynchronous. And that's not the point of having libuv.
2. So we can add code for File System Access, Http server & DB access in C++ since there are libraries (header files) that gives that functionality since Java is written in C++ (correct me if I am wrong) and Java has the capability to do the same.
Yes we can add lots of code for File System Access, Http server. But Why? We do already have a lot of libraries to do that. And yes its already written in C thats how NodeJS executes them.
Java already has that?
Right, but thats also a part of JVM rather than the core Java language, just like libuv is part of NodeJS runtime rather than the core Javascript language. In this regard both Java and NodeJS are similar. Its just that Java has its own C++ layer and NodeJS borrows libuv for that. BTW libuv was primarily built for NodeJS
3. Now if we can add these functionality in C++ where does the place for Libuv come in to the picture of NodeJs architecture.
I answered how these functionalities are already in C++, now lets see where libuv fits in this picture of the whole architecture.
Lets take an ajax/network call for example. Who do you think executes this?
NodeJS? No, It just gives instruction to its C++ API (Node API).
then is it Node API? No, It just gives instruction to the libuv
then is it libuv? Yes, it is
Same goes for timers, file access, child processes etc.
Also think when a lot of network calls, file access are fired within a NodeJS program, on what process it runs? who schedules them? who notify about the results and failure.
This is a lot to do. Java has its own thread pool to do that. Java has its own schedular to schedule the threads. and since Java provides threads to end user(programmers) as well. It makes sense to implement all that stuff using Java threads.
But NodeJS is single-threaded. Why it should have threads to execute I/O operations when it can borrow it from another library without making them a part of Javascript? After all, we aren't going to provide threads to the programmer so why bother?
Also Historically, Javascript was only meant to run in browsers. The only asynchronous operation browsers had access to were network requests, no file access, no DB. So we did not have a lot of bedrock already to build upon.
How and when is the single threaded asynchronous processing model of nodejs a better approach than the multithreaded approach of the known server Gurus like PHP, Java and C#?. Can someone please explain to me simply and clearly?
My question is how technically is the single threaded asynchronous processing model a better approach ?
Grasping the Node JS alternative to multithreading
Node.js was created explicitly as an experiment in async processing. The theory was that doing async processing on a single thread could provide more performance and scalability under typical web loads than the typical thread-based implementation.
The single threaded, async nature does make things complicated. But do you honestly think it's more complicated than threading? One race condition can ruin your entire month! Or empty out your thread pool due to some setting somewhere and watch your response time slow to a crawl! Not to mention deadlocks, priority inversions, and all the other gyrations that go with multithreading.
But is it really single threaded. Read this article https://softwareengineeringdaily.com/2015/08/02/how-does-node-js-work-asynchronously-without-multithreading/
Node.js is built on top of Google's V8 engine, which in turns compiles JavaScript. As many of you already know, JavaScript is asynchronous in nature. Asynchronous is a programming pattern which provides the feature of non-blocking code i.e do not stop or do not depend on another function / process to execute a particular line of code.Asynchronous is great in terms of performance, resource utilization and system throughput. But there are some drawbacks:
Very difficult for a legacy programmer to proceed with Async.
Handling control flow is really painful.
Callbacks are dirty.
NodeJS is single threaded and it is not a deterrent or a performance block really. The single threaded event loop is super efficient and is much less complicated than deploying effective multithreading. Multi-threading does not always mean better performance.
Having said that, if you do need to handle heavy concurrency, then you can employ the services of the cluster module which splits multiple NodeJS processes across available CPU cores, all the while maintaining a link with a master process which can be used to control/offload processing tasks.
Node was built from the ground up with asynchronicity in mind, leveraging the event loop of JavaScript. It can handle a lot of requests quickly by not waiting around for the request when there are certain kinds of work being done for the request, such as database requests.
Imagine you have a database operation that takes 10 seconds to complete, represented by a setTimeout
router.route('/api/delayed')
.get(function(req, res) {
setTimeout(function() {
res.end('foo');
}, 10000);
});
router.route('/api/immediate')
.get(function(req, res) {
res.end('bar');
});
or a back end framework that does not support asynchronous execution, this situation is an anti-pattern: the server will hang as it waits for the database operation to complete and then fulfill the request. In Node, it fires off the operation and then returns to be ready to field the next incoming request. Once the operation finishes, it will be handled in an upcoming cycle of the event loop and the request gets fulfilled.
As long as we only write non-blocking code, our Node server will perform better than other backend languages
After reading about it in the book: Web Development with MongoDB and Node.js 2nd Edition by Maithun Satheesh, Jason Krol and Bruno Joseph D'mello, I finally came across a clear advantage
To understand this, we should understand the problem that Node.js
tries to resolve. It tries to do asynchronous processing on a single
thread to provide more performance and scalability for applications
that are supposed to handle too much web traffic. Imagine web
applications that handle millions of concurrent requests; if the
server makes a new thread for handling each request that comes in, it
will consume a lot of resources and we would end up trying to add
more and more servers to increase the scalability of the application.
The single threaded asynchronous processing model has its advantage
in the previous context, and you can process much more concurrent
requests with less number of server-side resources.
And I notice that one can process much more concurrent requests with less serverside resources
my 2 pence worth.... i am not sure sure "if the the single-threaded approach of nodejs is better" : simply put, nodejs does not support multi-threading. that can translate loosely to " everything runs in a single thread". Now, I am not quite sure how it can "compare" to a multi-threaded system , as a "multi" threaded system can support both as a single thread (like nodejs ) and multiple threads . Its all in your application design , and the platform capabilities that are available to you.
What is more important, in my opinion, is the ability to support multi-tasking, in an asynchronous way .Nodejs, does provide support for multi- tasking, in a simplified and easy-to use package. It does have limitations on due to the lack of native support for multi-threading. to take advantage of the multi-tasking ( and not worry.. much ) about multi-threading, think along the lines of designing your serverside application as performing little chunks of work over a long period of tie , and each chunk of work is invoked, and consuming events generated from the clientside . Think an event-driven design/architecture ( simple switch/case for loops, callbacks, and data checkpointing to files or database, can do the trick). And I will bet my tiny dollar , that if you get your application to work in this fashion, sans multi-threading, it will be a much better design , more robust, and if you migrate it ( and adapt for multi-threading ) it run like on an SpaceX buster!
While multi-threading is aways a plus for serverside implementation, it is also a powerful beast that requires a lot of experience and respect to tame and harness ( something that nodejs shields/protects you from)
Another way to look at is is this : Multi-tasking is a perspective ( of running several tasks) at the application level , which multi-threading is a perspective, at a lower level : Multi-tasking can be mapping on to different implementations, with multi-threading being one of them .
Multi-threading capability
Truth : Node.js ( currently ) does not provide native support for multi-threading in the sense of low level execution/processing threads. Java, and its implementations /frameworks, provides native support for multi-threading, and extensively too ( pre-emption, multi-tenancy, synchronous multi-threading, multi-tasking, thread pools etc )
Pants on Fire(ish) : lack of multi-threading in Nodejs is a show stopper. Nodejs is built around an event driven architecture , where events are produced and consumed as quickly as possible. There is native support for functional call backs. Depending on the application design, this highlevel functionality can support what could otherwise be done by thread. s
For serverside applications, at an application level , what is important is the ability to perform multiple tasks, concurrently :ie multi-tasking. There are several ways to implement multi-tasking . Multi-threading being one of them, and is a natural fit for the task. That said, the concept of “multi -threading “ is is a low level platform aspect. For instance multi-threaded platform such as java , hosted/running on a single core process server ( server with 1 CPU processor core) still supports multi-multi at the application level, mapped to multi-threading at the low level , but in reality , only a single thread can execute at any ontime. On a multi-core machine with sa y 4 cores , the same multi-tasking at application level is supported , and with up to 4 threads can executing simultaneously at any given time. The point is, in most cases, what really matters is the support for mult-tasking, which is not always synonymous with multi-threading.
Back to node.js , the real discussion should be on application design and architecture , and more specifically, support for MULTI-TASKING. In general, there is a whole paradigm shift between serverside applications and clientside or standalone applications, more so in terms of design, and process flow. Among other things, server side applications need to run along side other applications( onthe server ), need to be resilient and self contained (not affect the rest o f the server when the application fails or crashes ) , perform robust exception handling ( ie recover from errors, even critical ones ) and need to perform multiple tasks .
Just the ability to support multi-tasking is a critical capability for any server side technology . And node.js has this capability, and presented in a very easy to use packaging . This all means that design for sever side applications need to focus more on multi-tasking, and less on just multi-threading. Yes granted, working on a server-side platform that supports multi-threading has its obvious benefits ( enhanced functionality , performance) but that alone does not resolve the need to support multi-tasking at the application level . Any solid application design for server side applications ,AND node.js must be based on multi-tasking through event generation and consumption ( event processing). In node.js , the use of function callbacks, and small event processors (as functions ), with data checkpointing ( saving processing data , in files or data bases) across event processing instances is key.
What else for Node.js vs Java
a whole lot more! Think scalability , code management , feature integration , backward, forward compatibility , return on investment , agility, productivity …
,… to cut on “verbosity” of this article , pun intended :), we will leave it at this for now :)
whether you agree or not, please shoot the messenger ( Quora) and not the opinions!
I am fascinated with non-blocking architectures. While I haven't used Node.js, I have a grasp of it conceptually. Also, I have been developing an event-driven web app so I have a fundamental understanding of event programming.
How do you write non-blocking javascript in the browser? I imagine this must differ in some ways from how Node does it. My app, for example, allows users to load huge amounts of data (serialized to JSON). This data is parsed to reconstitute the application state. This is a heavy operation that can cause the browser to lock for a time.
I believe using web workers is one way. (This seemed to be the obvious choice, however, Node accomplishes a non-blocking, event-driven architecture I believe without using Web Workers so I guess there must be another way.) I believe timers can also play a role. I read about TameJS and some other libraries that extend the javascript language. I am interested in javascript libraries that use native javascript without introducing a new language syntax.
Links to resources, libraries and practical examples are most appreciated.
EDIT:
Learned more and I realize that what I am talking about falls under the term "Futures".
jQuery implements this however, it always uses XHR to call a server where the server does the processing before returning the result and what I am after is doing the same thing without calling the server, where the client does the processing but in a non-blocking manner.
http://www.erichynds.com/jquery/using-deferreds-in-jquery/
Three are two methods of doing non-blocking work on the browser
Web Workers. WebWorkers create a new isolated thread for you to do computation in, however browser support tells you that IE<10 hates you.
not doing it, expensive work in a blocking fashion should not be done on the client, send an ajax request to a server to do this, then have the server return the results.
Poor man's threads:
There are a few hacks you can use:
emulate time splicing by using setTimeout. This basically means that after every "lump" of work you give the browser some room to be responsive by calling setTimeout(doMore, 10). This is basically writing your own process scheduler in a really poor non optimized manner, use web workers instead
creating a "new process" by creating a iframe with it's own html document. In this iframe you can do computation without blocking your own html document from being responsive.
What do you mean by non-blocking specifically?
The longest operations, Ajax-calls, are already non-blocking (async)
If you need some long-running function to run "somewhere" and then do something, you can call
setTimeout(function, 0)
and call the callback from the function.
And you can also read on promises and here as well