How does Node.js load scripts into memory?

In the browser, the DOM is parsed and scripts are loaded and parsed in the order they are defined.
In Node.js, how are scripts loaded into memory?
Is the entire graph of scripts defined by the require statements in each file traversed at initialisation-time, with the resulting objects and values hydrating the stack and heap ready for execution to start?

Synchronously. Whenever it encounters a require it synchronously loads the script and runs it - then, when other scripts are found it synchronously loads them.
IIRC, in the 0.2 days there was an asynchronous version, but it has been gone for a long time. As for what it actually does:
Basically, it does an fs.readFileSync.
More specifically: calling require calls _load, which first checks the cache, then creates the module and calls the handler for the relevant extension. Several extensions are supported (for example .json) and each one is loaded differently; in the common .js case it just calls fs.readFileSync and then compiles the result (which involves wrapping it, injecting exports and running it).
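To make those steps concrete, here is a minimal sketch of the same pipeline (cache check, module creation, extension dispatch, wrap and run). It is not Node's actual loader: miniRequire and its cache are hypothetical names, path resolution is simplified to path.resolve, and new Function stands in for the internal compile step.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical cache keyed by absolute filename (Node keeps a similar one in require.cache).
const cache = new Map();

function miniRequire(filename) {
  const resolved = path.resolve(filename);

  // 1. Return the cached exports if this file was already loaded.
  if (cache.has(resolved)) return cache.get(resolved).exports;

  // 2. Create the module record and cache it before running any of its code.
  const module = { exports: {}, filename: resolved };
  cache.set(resolved, module);

  // 3. Read the file synchronously and dispatch on its extension.
  const source = fs.readFileSync(resolved, "utf8");
  if (path.extname(resolved) === ".json") {
    module.exports = JSON.parse(source);
  } else {
    // 4. Wrap the source in a function so exports, require and module are injected, then run it.
    const wrapper = new Function("exports", "require", "module", "__filename", "__dirname", source);
    wrapper(module.exports, miniRequire, module, resolved, path.dirname(resolved));
  }
  return module.exports;
}

// Usage: write a throwaway module to a temp directory and load it twice.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mini-require-"));
const file = path.join(dir, "counter.js");
fs.writeFileSync(
  file,
  "globalThis.loadCount = (globalThis.loadCount || 0) + 1; module.exports = { answer: 42 };"
);

const first = miniRequire(file);
const second = miniRequire(file);
console.log(first.answer, first === second, globalThis.loadCount);
```

The second call never touches the disk or re-runs the file: it returns the same exports object straight from the cache.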

Related

How do I calculate the load time penalty for Node.js package?

To give some context, I'm interested in learning how to optimise the cold start time of a Node.js Express application running onboard a Google Cloud Function.
So far, I've learned the biggest penalty for cold boots is the loading of dependencies using require statements. As a software engineer, my scientific mind tells me it will depend on the number of files, the size of the files, the number of dependencies, and caching. However, perhaps optimising is more of an art than a science, so any pointers or feedback from your own experience are most appreciated.
My questions are
1. If I surround a const x = require('x') with two process.hrtime() calls and measure the time difference, will I be measuring the load time of the entire package 'x'?
2. If so, does this include the loading of all files within this package? What about the dependencies this package requires: when are they loaded (which leads to my third question)?
3. If a require statement is inside a conditional block, e.g. if (condition) { const x = require('x'); }, at what point is it 'loaded' and what does 'loaded' really mean (i.e. in memory, parsed, executed etc.)? Will it 'load' at the moment in runtime the statement is reached (or not), or will the require happen regardless when the program begins execution?

Yes, require is just a normal function which either returns the module if it's in memory or loads it if it isn't, which means reading the file, parsing it and executing it (and of course it may involve requiring other dependencies).
There's no problem benchmarking it (just make sure you measure the first require of a file, as the module is cached).
If a require is in your file but isn't executed (for example because it's behind an if statement), it won't have any more effect than if it weren't in your file. And the loading won't happen before the statement is reached.
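A rough way to check all three points yourself is sketched below. The timeRequire helper and the throwaway heavy.js module are hypothetical and only there to keep the example self-contained; process.hrtime.bigint() is used as the nanosecond variant of process.hrtime().

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical throwaway module that counts how many times its body runs.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "require-timing-"));
const file = path.join(dir, "heavy.js");
fs.writeFileSync(
  file,
  "globalThis.heavyLoads = (globalThis.heavyLoads || 0) + 1; module.exports = { ready: true };"
);

// Time a single require call, including everything that module requires in turn.
function timeRequire(id) {
  const start = process.hrtime.bigint();
  const loaded = require(id);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { loaded, elapsedMs };
}

// Nothing is loaded until a require call is actually reached.
const cachedBefore = require.resolve(file) in require.cache;

const cold = timeRequire(file); // reads, compiles and executes the file
const warm = timeRequire(file); // served from require.cache, the file is not executed again

const cachedAfter = require.resolve(file) in require.cache;

console.log({ cachedBefore, cachedAfter, coldMs: cold.elapsedMs, warmMs: warm.elapsedMs });
```

Only the cold number is meaningful for cold-start work; the warm one mostly measures a cache lookup.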

What does synchronous vs asynchronous loading mean?

Reading from this site, I understand that using CommonJS means that when the browser finishes downloading your files, it has to load them one by one as they depend on each other. But with AMD, can it load multiple files at the same time, so that even if file a depends on file b, part of file a can be executed before file b has finished loading?
CommonJS Modules: The dominant implementation of this standard is in Node.js (Node.js modules have a few features that go beyond CommonJS). Characteristics: compact syntax; designed for synchronous loading and servers.
Asynchronous Module Definition (AMD): The most popular implementation of this standard is RequireJS. Characteristics: slightly more complicated syntax, enabling AMD to work without eval() (or a compilation step); designed for asynchronous loading and browsers.

Synchronous programming is executing code line by line. Same with loading.
Whatever you are loading gets loaded one item at a time.
Real-world example: you are in a queue at the cinema for a movie ticket.
Asynchronous would be many people in a restaurant. You order food and other people order food. They don't need to wait for your order to finish.
Everyone can order, but you don't know when each order will arrive. Same with loading. You can load multiple things at the same time or at different intervals, but there is no guarantee they will arrive in that order.
I hope the explanation is good enough.
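The same contrast can be shown with plain file reads in Node.js. This is only a sketch: the two text files are hypothetical and created just for the demonstration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical files created only for this demonstration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "load-order-"));
const files = ["a.txt", "b.txt"].map((name) => {
  const file = path.join(dir, name);
  fs.writeFileSync(file, name);
  return file;
});

// Synchronous (the cinema queue): each read blocks until it finishes,
// so results always arrive in call order.
const syncOrder = files.map((file) => fs.readFileSync(file, "utf8"));

// Asynchronous (the restaurant): both reads are started right away and the
// callbacks run later, in whatever order the reads happen to complete.
const asyncOrder = [];
for (const file of files) {
  fs.readFile(file, "utf8", (err, text) => {
    if (err) throw err;
    asyncOrder.push(text);
    if (asyncOrder.length === files.length) {
      console.log("async finished:", asyncOrder);
    }
  });
}

console.log("sync finished:", syncOrder);
console.log("async results so far:", asyncOrder.length);
```

The last line reports zero async results because the callbacks cannot run until the current synchronous code has finished.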

The syntax with CommonJS in loading modules is as such:
var MyModule = require("MyModule");
As you can see, this will block the thread until the module is downloaded, from either your filesystem or the web. This is called synchronous loading. This is impossible to achieve in a normal web browser environment without affecting user experience, since we cannot block the thread as the browser uses the thread to update the graphics.
With RequireJS, it's done as such:
// In your module
define(["dependencies", ...], function(){
return MyModule;
})
// In your web page
require(["dependencies", ...], function(MyModules, ...){
// do stuff here
});
With this model, our web page does not depend on the timing of when the module should be loaded. We can load our scripts in parallel while the page is still being loaded. This is called asynchronous loading. Once the scripts are loaded, they will call define which notifies RequireJS that the scripts are indeed loaded and executed. RequireJS will then call your main function and pass in the initialized modules.
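To illustrate the idea (this is not RequireJS's code), here is a miniature of the define/require handshake. The names define, requireModules, defined and waiting are hypothetical, and script downloading is left out entirely: the define calls simply stand in for scripts arriving in an arbitrary order.

```javascript
// Hypothetical miniature of the define/require pattern.
const defined = new Map(); // module name -> initialized module value
const waiting = [];        // entries whose dependencies are not all defined yet

// Run every waiting entry whose dependencies have become available.
function flush() {
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (let i = 0; i < waiting.length; i++) {
      const entry = waiting[i];
      if (entry.deps.every((dep) => defined.has(dep))) {
        waiting.splice(i, 1);
        entry.run(entry.deps.map((dep) => defined.get(dep)));
        progressed = true;
        break; // the list changed, so rescan from the start
      }
    }
  }
}

// Called by a module script once it has been loaded.
function define(name, deps, factory) {
  waiting.push({ deps, run: (args) => defined.set(name, factory(...args)) });
  flush();
}

// Called by the page: the callback runs once all dependencies are initialized.
function requireModules(deps, callback) {
  waiting.push({ deps, run: (args) => callback(...args) });
  flush();
}

// Usage: the page asks for "greeter" before any script has arrived, and the
// scripts then arrive out of dependency order.
const log = [];
requireModules(["greeter"], (greeter) => log.push(greeter.greet("world")));
define("greeter", ["punctuation"], (p) => ({ greet: (who) => "hello " + who + p.mark }));
log.push("greeter script arrived");
define("punctuation", [], () => ({ mark: "!" }));
console.log(log);
```

Nothing blocks while a dependency is missing; the main callback simply stays parked until the last define call completes the chain.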

Asynchronous loading JavaScript functions.

I am building a framework in which I have merged all JavaScript files into one file (minify).
Example:
function A() {} function B() {}
From this minified file I want to load a function asynchronously and remove it from the HTML when its work is done.
Example: load function A when it is required but not function B.
I have seen one framework, Require.js, but it loads whole JavaScript files asynchronously, as they are needed.
Is there any framework which loads, on demand, JavaScript functions that are defined in the same JS file?

The downside to concatenation is that you get less fine-grained control over what you include on a given page load. The solution is, rather than creating one concatenated file, to create layers of functionality that can be included more modularly. That way you don't need all your JS on a page that may only use a few specific functions. This can actually be a win in speed as well, since a single JS file might not take advantage of the browser's 6 concurrent connections. Moreover, once SPDY is fully adopted, one large file will actually be less performant than several smaller ones (since connections can be reused). Minification will still be important, however.
All that said, it seems you are asking for something a little difficult to pull off. When a browser loads a script, it gets parsed and executed immediately. You can't load the file then... only load part of the file. By concatenating, you are restricting yourself to that large payload.
It is possible to delay execution by wrapping a script in a block comment, then accessing it from the script node and eval()ing it... but that doesn't seem like what you are asking. It can be a useful strategy, though, if you want to preload modules without locking the UI.
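A rough sketch of that comment-wrapping trick follows. It assumes the payload arrives as a string (in a page it would be read from the script node's text) and uses new Function in place of eval() so the wrapped code can return a value; runDeferred and the payload contents are hypothetical.

```javascript
// Hypothetical payload: the module body is shipped inside a block comment, so
// when the script is loaded the engine sees only a comment and runs nothing.
const payload = [
  "/*",
  "globalThis.lazyRuns = (globalThis.lazyRuns || 0) + 1;",
  "return { greet: (name) => 'hi ' + name };",
  "*/",
].join("\n");

// Strip the comment markers and execute the body only when it is needed.
function runDeferred(text) {
  const body = text.trim().replace(/^\/\*/, "").replace(/\*\/$/, "");
  return new Function(body)();
}

const runsBeforeEval = globalThis.lazyRuns; // still undefined: nothing has run yet
const lazyModule = runDeferred(payload);    // the body runs here, on demand
console.log(runsBeforeEval, globalThis.lazyRuns, lazyModule.greet("Ada"));
```

The payload is still downloaded in full; the only thing deferred is the cost of parsing and executing it.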

That's not how javascript works. When the function's source file is loaded, the function is available in memory. Since the language is interpreted, the functions that are defined would be loaded as soon as the source file was read by the browser.
Your best bet is to use Require.js or something similar if you want to have explicit dependency chains.

Does javascript execute if no function is called?

I have some WordPress pages with JavaScript code that require JavaScript file references. For pages that don't call functions within these JS file references, there should be no performance impact from including these files (except the file request itself), right?
-- EDIT in response to @cdhowie --
If only certain pages require these javascript files, is it possible to move them out of the head section and into the page? I've read this is bad practice.
But in theory, this prevents the entire site from having a performance hit for files that are not being utilized?

The referenced JavaScript files will be downloaded (or fetched from the cache) and then be executed by the browser's JavaScript interpreter in both cases. The "JavaScript file references" need to be executed in order to create the variables and functions that you might use, and the browser has no way of knowing ahead of time if you will use them. Further, the included files might actually manipulate the document, and the browser doesn't know this either until it has executed them.
So yes, there will be a performance impact whether or not you call the functions. Whether or not it's significant enough for you to worry about is something you will have to determine. (Always profile your page's loading time before making decisions like this!)
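A small Node.js sketch of the distinction: the vm module stands in for the browser loading a script, and scriptSource is a hypothetical included file. Top-level statements run as soon as the file is loaded, while a function body runs only when the function is called.

```javascript
const vm = require("vm");

// Hypothetical stand-in for an included .js file.
const scriptSource = `
  events.push("top-level statement ran");
  function helper() { events.push("helper ran"); }
`;

const sandbox = { events: [] };

// Roughly what including the file does: parse everything, run the top level.
vm.runInNewContext(scriptSource, sandbox);
const afterLoad = [...sandbox.events];

// The function body only runs once something calls it.
sandbox.helper();
const afterCall = [...sandbox.events];

console.log(afterLoad, afterCall);
```

So an unused file still costs a download, a parse and its top-level execution, even if none of its functions is ever called.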

Javascript functions are only executed when you explicitly call them (or implicitly in callbacks and whatnot). The code will however still be interpreted by the browser on each page regardless of functions being called or not.
Edit:
I was wrong to say the performance hit is irrelevant. It really all depends on your exact situation (where the code is coming from, how much code, etc.) and also how much you care about performance in terms of milliseconds.
One possible "performance" issue is if those extra .js files are on your server. If so, and you are loading them when they are not needed, you are causing unnecessary traffic and bandwidth use on your server.

This will execute, but take up very little cpu time
<script type="text/javascript">
// just a comment
</script>
no functions, just a comment... but it's still "code", still has to be parsed, still has to be checked for syntax errors, etc...

What happens if you use a <script> tag with the same "src" attribute multiple times within a single HTML document?

Although I am almost certain the answer to this question will be browser specific, do any of the browsers define behavior for when multiple <script> tags are used and have the same src attribute?
For instance...
<script src="../js/foo.js"></script>
...

<script src="../js/foo.js"></script>
The reason I ask this question in the first place, is that in my particular case I am using partial views in an ASP.NET MVC application which make use of JQuery. The JQuery JS file(s) are all included in the master template file via script tags. I would prefer to add script tags to the partial view files so that in case they are used outside the context of the master template, they will automatically include all the necessary JS files, and not rely on another view or template to include them. However, I certainly don't want to cause JS files to have to be transferred to the client multiple times, or any other side effects that could negatively impact the user experience.
My thinking right now is that most, if not all, of the major browsers (FF, Safari, IE, Opera) will cache a JS file the first time it is used, and then on subsequent script tags the browser will use the cached copy if available and if it hasn't expired. However, caching behavior can usually be altered through browser configuration, so it doesn't seem too "safe" to rely on any kind of caching behavior.
Will I just have to accept the fact that my partial views are going to have be dependent on other templates or views including the appropriate JS files?

Even if they're cached, you may have problems since the same code will be executed twice. At the very least, this will cause the browser to take more time than necessary. And it may cause errors, since most JavaScript code isn't written to be executed twice. For example, it may attach the same event handlers twice.

Don't output script tags directly in your partials. Create a mechanism to register script files for later inclusion. That mechanism can be responsible for only including files once.
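The registration idea is independent of the view engine. As a minimal sketch in JavaScript (createScriptRegistry is a hypothetical helper, bar.js is a made-up second file, and in ASP.NET MVC the equivalent would live in a server-side helper):

```javascript
// Hypothetical registry: partials register the scripts they need, and the
// layout renders each distinct src exactly once.
function createScriptRegistry() {
  const seen = new Set(); // a Set ignores repeats and keeps insertion order

  return {
    register(src) {
      seen.add(src);
    },
    render() {
      return [...seen].map((src) => `<script src="${src}"></script>`).join("\n");
    },
  };
}

// Usage: two partials both ask for foo.js, and one also needs bar.js.
const registry = createScriptRegistry();
registry.register("../js/foo.js");
registry.register("../js/bar.js");
registry.register("../js/foo.js");

const tags = registry.render();
console.log(tags);
```

Each partial stays self-describing about its dependencies, while the page still ends up with a single tag per file.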

What happens is that JavaScript is fed into the interpreter the moment it is downloaded. In the event of a namespace collision, only the last definition of a given name in a given scope survives to execution. Normally this last-only process prevents problems by overwriting functions fed into the interpreter earlier. The problem is that a function defines a variable scope, and those variables could themselves be functions that introduce further scopes of variables. If functions share the same name but contain different variable definitions, there can be leakage: variables from a function fed into the interpreter earlier survive even after that function is overwritten, which can then cause unexpected namespace collisions.
If the exact same file is included twice there should be no problem. The problem occurs when different versions of the same file are included or different files with the same function names are included. Including the same file twice can mean multiple transmissions, which is a waste of bandwidth.

FF 3.5x, Chrome 4x include it only once.
:) IE 8 has two copies (view in Developer Tools > Scripts tab there are two jquery-1.3.2.min.js entries)
