My understanding of an interpreter is that it executes a program line by line, so we can see results instantly, unlike compiled languages, which convert the code first and then execute it.
My question is: in JavaScript, how does the interpreter know that a variable is declared somewhere later in the program, so that it logs it as undefined?
Consider the program below:
function do_something() {
console.log(bar); // undefined (but in my understanding about an interpreter, it should be throwing error like variable not declared)
var bar = 111;
console.log(bar); // 111
}
Is implicitly understood as:
function do_something() {
var bar;
console.log(bar); // undefined
bar = 111;
console.log(bar); // 111
}
How does this work?
This concept of 'var hoisting' is quite confusing if you only look at it on the surface. You have to delve into how the language itself works. JavaScript, which is an implementation of ECMAScript, is an interpreted language, meaning all the code you write is fed into another program that, in turn, interprets the code, calling certain functions based on parts of your source code.
For example, if you write:
function foo() {}
The interpreter, once it meets your function declaration, will call a function of its own called FunctionDeclarationInstantiation that creates the function. Instead of compiling JavaScript into native machine code, the interpreter executes C, C++, and machine code of its own 'on demand' as each part of your JavaScript code is read. This does not necessarily mean line by line; all 'interpreted' means is that no compilation into machine code happens ahead of time. A separate program reads your code and executes its own machine code on the fly.
What this has to do with var declaration hoisting, or any declaration for that matter, is that the interpreter first reads through all your code once without executing any of it. It analyzes the code and separates it into chunks, called lexical environments. Per the ECMAScript 2015 Language Specification:
8.1 Lexical Environments
A Lexical Environment is a specification type used to define the association of Identifiers to specific variables and functions based upon the lexical nesting structure of ECMAScript code. A Lexical Environment consists of an Environment Record and a possibly null reference to an outer Lexical Environment. Usually a Lexical Environment is associated with some specific syntactic structure of ECMAScript code such as a FunctionDeclaration, a BlockStatement, or a Catch clause of a TryStatement and a new Lexical Environment is created each time such code is evaluated.
An Environment Record records the identifier bindings that are created within the scope of its associated Lexical Environment. It is referred to as the Lexical Environment’s EnvironmentRecord.
Before any code is executed, the interpreter goes through your code and for every lexical structure, such as a function declaration, a new block, etc, a new lexical environment is created. And in those lexical environments, an environment record records all the variables declared in that environment, their value, and other information about that environment. That's what allows for JavaScript to manage variable scope, variable lookup chains, this value, etc.
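To make the idea of an environment record with an outer reference more concrete, here is a toy model. It is only a sketch: the Environment class and its declare and lookup methods are hypothetical names invented for illustration, and real engines do not expose anything like this.

```javascript
// Toy model only: a hypothetical stand-in for a lexical environment.
class Environment {
  constructor(outer = null) {
    this.record = new Map(); // identifier name -> current value
    this.outer = outer;      // enclosing environment, or null at the top
  }

  // Create a binding; like `var`, it starts out as undefined by default.
  declare(name, value = undefined) {
    this.record.set(name, value);
  }

  // Walk outward through the chain until the identifier is found.
  lookup(name) {
    for (let env = this; env !== null; env = env.outer) {
      if (env.record.has(name)) return env.record.get(name);
    }
    throw new ReferenceError(`${name} is not defined`);
  }
}

const globalEnv = new Environment();
globalEnv.declare("greeting", "hello");

const functionEnv = new Environment(globalEnv);
functionEnv.declare("bar"); // like `var bar`: created before any code runs

console.log(functionEnv.lookup("bar"));      // undefined
console.log(functionEnv.lookup("greeting")); // hello (found in the outer environment)
```

Looking up a name that no environment in the chain has recorded throws a ReferenceError, which mirrors what happens when you read a variable that was never declared.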
Each lexical environment is associated with a code realm:
8.2 Code Realms
Before it is evaluated, all ECMAScript code must be associated with a Realm. Conceptually, a realm consists of a set of intrinsic objects, an ECMAScript global environment, all of the ECMAScript code that is loaded within the scope of that global environment, and other associated state and resources.
Every section of JavaScript/ECMAScript code you write is associated with a realm before any of the code is actually executed. Each realm consists of the intrinsic values used by the specific section of code associated with the realm, the this object for the realm, a lexical environment for the realm, among other things.
This means each lexical part of your code is analyzed before executing. Then a realm is created that houses all the information on that set of code: the source, which variables are needed to execute it, which variables have been declared, what this is, and so on. In the case of var declarations, this happens when you define a function, as you did here:
function do_something() {
console.log(bar); // undefined
var bar = 111;
console.log(bar); // 111
}
Here, a FunctionDeclaration creates a new lexical environment, associated with a new realm. When a lexical environment is created, the interpreter analyzes the code and finds all declarations. Those declarations are then first processed at the very beginning of that lexical environment, thus the 'top' of the function:
13.3.2 Variable Statement
A var statement declares variables that are scoped to the running execution context’s VariableEnvironment. Var variables are created when their containing Lexical Environment is instantiated and are initialized to undefined when created.
Thus, whenever a lexical environment is instantiated (created), all the var declarations are created, initialized to undefined. That means they are processed before any code is executed, at the 'top' of the lexical environment:
var bar; //Processed and declared first
console.log(bar);
bar = 111;
console.log(bar);
Then, after all your JavaScript code is analyzed, it is finally executed. Because the declaration was processed first, it is declared (and initialized to undefined) giving you undefined.
Hoist is kind of a misnomer really. Hoist implies that the declarations are moved directly to the top of the current lexical environment, but instead the code is analyzed before execution; nothing is moved.
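One way to see that nothing is moved and that declarations are simply processed up front is to put the var inside a block that never runs. The function names below are made up for this sketch.

```javascript
function readBeforeAssignment() {
  const seen = [];
  seen.push(bar);   // undefined: `bar` already exists in this function's environment
  if (false) {
    var bar = 111;  // never executed, but the declaration was still processed
  }
  seen.push(bar);   // still undefined, because the assignment never ran
  return seen;
}

function readUndeclared() {
  try {
    return notDeclaredAnywhere; // no declaration exists anywhere for this name
  } catch (err) {
    return err.name;
  }
}

console.log(readBeforeAssignment()); // an array holding undefined twice
console.log(readUndeclared());       // ReferenceError
```

The first function never throws even though the assignment is unreachable, while reading a name with no declaration at all does throw.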
Note: let and const act in the same way and are also hoisted but this won't work:
function do_something() {
console.log(bar); //ReferenceError
let bar = 111;
console.log(bar);
}
This will give you a ReferenceError for trying to access an uninitialized variable. Even though let and const declarations are hoisted, the specification explicitly states that you cannot access them before they are initialized, unlike var:
13.3.1 Let and Const Declarations
let and const declarations define variables that are scoped to the running execution context’s LexicalEnvironment. The variables are created when their containing Lexical Environment is instantiated but may not be accessed in any way until the variable’s LexicalBinding is evaluated.
Thus, you can't access the variable until it is formally initialized, whether to undefined or any other value. That means you can't seemingly 'access it before it's declared' like you can with var.
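A small side-by-side sketch of the difference (the function names are invented for illustration):

```javascript
function readVarEarly() {
  const value = early; // allowed: a `var` binding starts out as undefined
  var early = 111;
  return value;
}

function readLetEarly() {
  let result;
  try {
    result = early;    // `early` exists here but is still uninitialized
  } catch (err) {
    result = err.name;
  }
  let early = 111;     // the binding is initialized only when this line runs
  return result;
}

console.log(readVarEarly()); // undefined
console.log(readLetEarly()); // ReferenceError
```

The error in the second function shows that the engine already knows about the let binding before its declaration line runs; otherwise there would be nothing to complain about in a function that declares it later.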
"Interpreted" doesn't mean what you think it does.
Actually, "interpreted" here means more like "compiled on demand" and, rather than being compiled line by line (as you thought), it is compiled in units of executable code. Those units are first read into memory and then later, executed.
It's during these phases that the scope of the execution context becomes known, declarations are hoisted and identifiers resolved.
The particulars of the implementations of all this are not standardized and each vendor is free to implement them as they like.
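You can observe the "whole unit is read first" behavior from inside JavaScript. In the sketch below, the Function constructor is used to hand the engine a unit of source text whose second line is deliberately broken; the variable names are made up for the example.

```javascript
const executionLog = [];

// A unit of code whose first line is valid and whose second line is not.
const unitSource = `
  log.push("first line ran");
  var = ;
`;

let parseOutcome;
try {
  const unit = new Function("log", unitSource); // the whole unit is parsed here
  unit(executionLog);
  parseOutcome = "ran";
} catch (err) {
  parseOutcome = err.name;
}

console.log(parseOutcome);        // SyntaxError
console.log(executionLog.length); // 0: the valid first line never ran
```

If the code really were executed line by line as it was read, the first line would have run before the error on the second line was discovered.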
Related
I just started learning about JS.
CASE-1: Please look at the image below.
My understanding of this behavior: on reaching line 9, which calls function a(), the JS interpreter will create a new execution context for function a (or, we can say, allocate memory on the heap for the execution context, which is an object, and point this to it, right?). Now I know that the interpreter will first go through the whole function body and allocate space for variable declarations and function declarations in the execution context created on the heap. This is evident from the right side of the image, where in the Local scope we have a variable b referencing the function's lambda. After this pass through the whole function body, the interpreter will execute the code. Since function b is already stored in a's execution context (or a's execution context knows about b), it will execute it. (This is what hoisting is, right?)
So far so good. But,
Now look at this image :
According to the concepts I described above,
on the right side, inside Local, we should have a variable b referencing the function's lambda, but it's not there.
What am I missing?
Are my concepts wrong?
Is it because of the Chrome console?
Can you explain this behavior?
CASE-2: OK, then I did another experiment to learn about the interpreter's behavior:
In both of the images above, space has been allotted to a variable a referencing the function's lambda. This is the complete opposite of the behavior in case-1.
Can you explain this behavior?
Small request (if you can): if you could use terms like stack/heap/memory/execution context object/link/pointers instead of lexical environment, scopes, closures, chain, etc., it would be much preferred, since those are all quite confusing. It's easier for me to understand things using the terms mentioned above.
Nothing strange here. V8 compiles JavaScript directly into machine code. As a result, unreachable code, such as function b, is removed when it is never referenced. This is a common property of almost every compiler.
What am I missing? Are my concepts wrong? Is it because of the Chrome console? Can you explain this behavior?
This code snippet differs from the case-1 example in that the call to b has been removed, so the compiler has removed the b function as well. As a result, you do not see any reference to it in the debugger.
case-2
In your examination of case-2, you overlook the fact that the debugger is stopped in the wrong place to analyze the interior scope of function a. As a result, you see a in the debugging window and that is it.
Your understanding section
On reaching line 9, which calls function a(), the JS interpreter will create a new execution context for function a (or, we can say, allocate memory on the heap for the execution context, which is an object, and point this to it, right?)
Not entirely.
line 9: execute function a() [correct]
create an execution context for function a [correct]
allocate memory in heap [incorrect]
execution context is an object [correct]
point this to it [incorrect]
The execution context is an object, however, the compiled function is a static function frame stored on the stack. The this binding is a separate value which does not reference the execution context, but instead provides an entrance for variable scoping.
Now I know that the interpreter will first go through the whole function body and allocate space for variable declarations and function declarations in the execution context created on the heap.
This is incorrect. As noted, the interpreter is working on compiled code. This code already has an established memory footprint. A pointer is created on the heap which points to the compiled code in the stack.
This is evident from the right side of the image, where in the Local scope we have a variable b referencing the function's lambda.
b is not a lambda function, that would have been var b = () => console.log('Hello i am B');, and would not have been hoisted. That aside, under the hood, a is just a nested context. b is also compiled, and its static reference is just a component of the already compiled a. Scoping and memory storage are two very different concepts.
After this pass through the whole function body, the interpreter will execute the code. Since function b is already stored in a's execution context (or a's execution context knows about b), it will execute it. (This is what hoisting is, right?)
This code is direct, so the interpreter would create a pointer on the heap, and immediately execute the compiled code on the stack. a is executed, then inside of a, b is executed as noted above.
Hoisting is simply moving declarations to the tops of scopes for function declarations and var declarations, but not const or let. Hoisting simply makes your first example equivalent to this:
function a(){
function b(){
console.log('Hello i am B')
}
console.log('Heloo')
b()
}
a()
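For comparison, here is a small sketch of how a function declaration and a function stored in a var behave when used before the line that defines them. The names are invented for this example.

```javascript
function callBeforeDefinition() {
  const results = [];

  results.push(declared());        // works: the declaration was set up in advance
  results.push(typeof expressed);  // "undefined": only the `var` binding exists so far

  try {
    expressed();                   // calling undefined is an error
  } catch (err) {
    results.push(err.name);        // TypeError
  }

  function declared() { return "declared ran"; }
  var expressed = () => "expressed ran";

  results.push(expressed());       // works now that the assignment has run
  return results;
}

console.log(callBeforeDefinition());
// the four entries are: declared ran, undefined, TypeError, expressed ran
```

The declaration is usable from the first line of the function, while the var version only becomes callable once execution reaches its assignment.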
As I understand it, every time a JavaScript program begins running, the engine first creates an execution context, pushes this execution context into the call stack/execution stack, and then it creates a global object (window in the browser and global in Node) as well.
To create the execution context, the engine first goes through the creation phase, where it allocates space in memory for entire function definitions and variable declarations (hoisting). It maintains a reference to the outer scope (this creates the scope chain, but in the global execution context there isn't anything above it), and it also creates the this property within the execution context and sets it to the window object in the browser and module.exports in Node. Lastly, the engine then goes through the execution phase, where it executes the code line by line and assigns a value to each variable.
Am I right in differentiating the global execution context creation from the creation of the global object itself? I view both of them as operations that happen side by side but are not the exact same thing.
Yes, it's fair to say that the global context and the global object are separate concepts. One illustrating distinction is the this binding: a context defines what this refers to (in case of the global context: to the global object); whereas the global object has no property named "this".
At the same time, global context and global object are somewhat coupled insofar as variables declared with var (and function declarations) in the former become properties on the latter.
Note that "execution context" is mostly an abstract concept, that means an engine only has to behave "as if" it did what the spec describes. Chances are that high-performance engines will take certain shortcuts (e.g., optimized code might keep some local variables in registers or on the machine stack, never putting them into any context at all).
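Both points can be checked with a short sketch. It assumes an environment where eval is permitted, and it uses indirect eval, (0, eval), only as a portable way to run a string as global code regardless of where the snippet itself lives (a Node module, for instance, is not global code). The names are made up for the example.

```javascript
// Indirect eval evaluates its argument as global code.
const runAsGlobalCode = (0, eval);

runAsGlobalCode(
  "var declaredGlobally = 111; function globalFn() { return 'hi'; }"
);

// Declarations made in the global context show up on the global object...
console.log(globalThis.declaredGlobally); // 111
console.log(typeof globalThis.globalFn);  // function

// ...but the global object has no property called "this".
console.log("this" in globalThis);        // false
```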
I understand the semantics: a closure that holds a reference to a variable lengthens its life cycle, so primitive variables are no longer limited by the call stack, and thus variables captured by closures have to be treated specially.
I also understand that, in modern JavaScript engines, variables in the same scope can be treated differently depending on whether they were captured by a closure. For example,
function foo(){
var a=2;
var b=new Array(a_very_big_number).join('+');
return function(){
console.log(a);
};
}
var b=foo();
As no one holds a reference to b in foo, there's no need to keep b in memory, so the memory it uses can be released as soon as foo returns (or it may never even be created under further optimization).
My question is, why does V8 seem to pack all variables referenced by any closure together in each calling context? For example,
function foo(){
var a=0,b=1,c=2;
var zig=function(){
console.log(a);
};
var zag=function(){
console.log(b);
};
return [zig,zag];
}
Both zig and zag seem to hold a reference to a and b, even though it's apparent that b is not available to zig. This could be awful when b is very big and zig persists for a very long time.
But from the implementation's point of view, I cannot understand why this is a must. As far as I know, without calling eval, the scope chain can be determined before execution, and thus the reference relationships can be determined. The engine should be aware that when zig is no longer reachable, neither is a, so the engine can mark it as garbage.
Both Chrome and Firefox seem to obey this rule. Does the standard say that any implementation must do this? Or is this implementation simply more practical, more efficient? I'm quite puzzled.
The main obstacle is mutability. If two closures share the same var then they must do so in a way that mutating it from one closure is visible in the other. Hence it is not possible to copy the values of referenced variables into each closure environment, like functional languages would do (where bindings are immutable). You need to share a pointer to a common mutable heap location.
Now, you could allocate each captured variable as a separate cell on the heap, instead of one array holding all. However, that would often be more expensive in space and time because you'd need multiple allocations and two levels of indirection (each closure points to its own closure environment, which points to each shared mutable variable cell). With the current implementation it's just one allocation per scope and one indirection to access a variable (all closures within a single scope point to the same mutable variable array). The downside is that certain life times are longer than you might expect. It's a trade-off.
Other considerations are implementation complexity and debuggability. With dubious features like eval and expectations that debuggers can inspect the scope chain, the scope-based implementation is more tractable.
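The mutability point can be illustrated with a short sketch. The first function shows the real JavaScript behavior, where both closures see one shared binding. The second simulates the "copy the value into each closure" strategy by hand; it is a hypothetical scheme, not something any engine does, and all names are invented for the example.

```javascript
// Real semantics: both closures share the same mutable binding.
function makeSharedCounter() {
  var count = 0;
  var increment = function () { count += 1; };
  var read = function () { return count; };
  return [increment, read];
}

// Hypothetical copying scheme: each closure gets its own snapshot.
function makeCopiedCounter() {
  const count = 0;
  const incrementCopy = { count }; // private copy for the first closure
  const readCopy = { count };      // private copy for the second closure
  return [
    function () { incrementCopy.count += 1; },
    function () { return readCopy.count; },
  ];
}

const [incrementShared, readShared] = makeSharedCounter();
incrementShared();
incrementShared();
console.log(readShared()); // 2: the update is visible through the other closure

const [incrementCopied, readCopied] = makeCopiedCounter();
incrementCopied();
console.log(readCopied()); // 0: the copy never sees the update
```

Since the language requires the first behavior, captured variables have to live in some shared, mutable location rather than being copied.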
The standard doesn't say anything about garbage collection, but gives some clues of what should happen.
Reference : Standard
"An outer Lexical Environment may, of course, have its own outer Lexical Environment. A Lexical Environment may serve as the outer environment for multiple inner Lexical Environments. For example, if a Function Declaration contains two nested Function Declarations then the Lexical Environments of each of the nested functions will have as their outer Lexical Environment the Lexical Environment of the current execution of the surrounding function."
Section 13 Function definition
step 4: "Let closure be the result of creating a new Function object as specified in 13.2"
Section 13.2 "a Lexical Environment specified by Scope" (scope = closure)
Section 10.2 Lexical Environments:
"The outer reference of a (inner) Lexical Environment is a reference to the Lexical Environment that logically surrounds the inner Lexical Environment."
So, a function will have access to the environment of the parent.
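As a small illustration of the quoted passage, here is a sketch with two nested functions that share the same outer environment. The names are made up for the example.

```javascript
var level = "top";

function surrounding() {
  var level = "surrounding";
  var onlyInSurrounding = "visible to both nested functions";

  // Both nested functions have surrounding()'s environment as their outer one.
  function first() {
    return level + ": " + onlyInSurrounding;
  }

  function second() {
    var level = "second"; // found locally, so the outer `level` is never consulted
    return level + ": " + onlyInSurrounding;
  }

  return [first(), second()];
}

const nestedResults = surrounding();
console.log(nestedResults[0]); // surrounding: visible to both nested functions
console.log(nestedResults[1]); // second: visible to both nested functions
```

Each lookup starts in the function's own environment and moves outward only when the name is not found there.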
For me, variables are easy to understand in JavaScript: if a variable is not in local scope, then it is a field in the global object.
But what about JavaScript commands? If I just write JavaScript commands in a file (outside any function), how will the JavaScript engine interpret them?
----- file.js -----
console.log('hai:DDD');
--- end of file ---
Will it create some kind of "global" or "main" function type object with the commands and then execute it? What if there are multiple files with code?
I guess this question only applies to Node.js, because in browsers all JavaScript code is event handlers.
JavaScript does not have a main function. It starts at the top and works its way down to the bottom.
In Node.js, variables are stored in the module scope which basically means they're not quite global. In a way, you could imagine any code you run in Node.js as being wrapped up like this:
(function(exports, require, module, __filename, __dirname) {
...
})();
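To see what such a wrapper implies for variables, here is a much-simplified sketch. runAsModule is a hypothetical helper written for this example, not Node's actual loader, and it passes only two of the five parameters.

```javascript
// Hypothetical helper: wraps source text in a function, as described above.
function runAsModule(source) {
  const module = { exports: {} };
  const wrapper = new Function("exports", "module", source);
  wrapper(module.exports, module);
  return module.exports;
}

const loaded = runAsModule("var secret = 42; exports.answer = secret;");

console.log(loaded.answer);            // 42
console.log(typeof globalThis.secret); // undefined: `secret` stayed inside the wrapper
```

Because the file's code ends up inside a function body, its var declarations are local to that function instead of landing on the global object.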
But you seem to have a misconception about the browser. Not all JS code is an event handler in the browser. If you just run a basic script in the browser it will also populate the global scope.
var myGlobal = "I'm available to everyone";
Javascript is, as the name implies, a scripting language to be interpreted by some Javascript interpreter. Thus, the "main function" can be thought of as the whole file, the entry point is at the first character of the first line in the script. Typically, the entirety of the functions the script is to perform are wrapped in a function that loads with the page, but this isn't necessary, just convenient.
There is no global function in JavaScript, but there are some similar concepts:
The Global Environment (10.2.3)
The global environment is a unique Lexical Environment which is
created before any ECMAScript code is executed. The global
environment’s Environment Record is an object environment
record whose binding object is the global object (15.1). The
global environment’s outer environment reference is null.
The Global Object (15.1)
A unique object used as the binding object of the environment record of the global environment.
Global Code (10.4.1, 10.1)
Global code is source text that is treated as an ECMAScript Program. The global code of a particular Program does not include any source text that is parsed as part of a FunctionBody.
Global Execution Context (10.4.1.1)
An execution context of global code.
Here is a link to a video that explains how JavaScript works.
link to the video
And tool to visualize how JavaScript works.
link to the tool
If you want your code to run after the window has loaded, there is window.onload, or $(document).ready() if you are using jQuery.
No, JavaScript is a scripting language; there is no entry point.
Lines of code are executed in the order they are encountered by the JavaScript interpreter.
If multiple files are included on the page, the functions and variables declared in them will be added to the global scope (unless they are declared inside an anonymous function).
Is there a defined name for the execution context that encapsulates all others in JavaScript?
For example, is it called the "global execution context"? That phrase is not mentioned in the ES6 spec as far as I can find.
It's called "the global environment" as per the ES5 spec (or see the equivalent section of the ES6 spec):
The global environment is a unique Lexical Environment which is created before any ECMAScript code is executed. The global environment’s Environment Record is an object environment record whose binding object is the global object (15.1). The global environment’s outer environment reference is null.
Or perhaps you are looking for the "initial global execution context", whose lexical and variable environment are references to the global environment.
I don't believe there's any term more specific than simply "global environment" or "global environment record" (to draw a distinction with its binding object), which is used in several places in the ES6 specification, such as §8.1 and §8.1.14 (draft spec, those section numbers may drift).