JavaScript: why access to a closure variable might be slow

Recently I read the performance guide "Let's make the web faster" and was puzzled by the "Avoiding pitfalls with closures" recommendations (as if this advice were aimed at Common Lisp users, where variable scoping is dynamic):
var a = 'a';
function createFunctionWithClosure() {
    var b = 'b';
    return function () {
        var c = 'c';
        a;
        b;
        c;
    };
}
var f = createFunctionWithClosure();
f();
when f is invoked, referencing a is slower than referencing b, which is slower than referencing c.
It's quite evident that referencing the local variable c is faster than b, but if the interpreter is written properly (without dynamic scoping, something like a chained hash-table lookup), the speed difference should only be marginal. Or not?

You're right. Modern JS engines heavily optimize both the scope chain lookup and the prototype chain lookup. AFAIK, the engine tries to maintain something like a hash table of access nodes underneath.
This only works if there is no eval() (explicit or implicit, e.g. passing a string to setTimeout), no try-catch clause, and no with statement involved. With such constructs, the interpreter can't be sure how data will be accessed, so it has to fall back to a classical scope chain lookup, which means it has to crawl through all parent contexts' variable / activation objects and try to resolve the variable name there.
This process, of course, takes more time for objects/names that are located "far away" from where the lookup started. That in turn means that accessing data on the global object will always be the slowest.
In your snippet, the lookup process for a would go like
anonymous function -> Execution Context -> Activation Object (not found)
anonymous function -> Execution Context -> [[ Scope ]]
- createFunctionWithClosure
- global scope
createFunctionWithClosure -> Activation Object (not found)
global scope -> Variable Object (found)
The described lookup procedure is for ECMAScript (ECMA-262), 3rd edition. ECMAScript 5th edition makes some fundamental changes there.
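To get a feel for whether this matters in a particular engine, a rough micro-benchmark along these lines can help. This sketch is not from the guide; the loop count is arbitrary, and a modern JIT may hoist or optimize the loops away, so treat the numbers as illustrative only.
var a = 'a';
function createFunctionWithClosure() {
    var b = 'b';
    return function () {
        var c = 'c';
        var sink, i;
        var t0 = Date.now();
        for (i = 0; i < 1e7; i++) { sink = a; }   // global variable
        var t1 = Date.now();
        for (i = 0; i < 1e7; i++) { sink = b; }   // closed-over variable
        var t2 = Date.now();
        for (i = 0; i < 1e7; i++) { sink = c; }   // local variable
        var t3 = Date.now();
        console.log('a:', t1 - t0, 'ms  b:', t2 - t1, 'ms  c:', t3 - t2, 'ms');
    };
}
createFunctionWithClosure()();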

Related

JavaScript Closure in Chrome

There is the following code
function fn1() {
    const a = 1;
    const b = 2;
    const c = 3;
    function fn2() {
        console.log("xx");
    }
    function fn3() {
        console.log(a);
        console.log(c);
    }
    return fn2;
}
const clo = fn1();
console.dir(clo);
When I ran this code in the Chrome console, console.dir(clo) showed a Closure (fn1) entry under [[Scopes]].
I know fn3 references variables of fn1, but I only return fn2, so why is there a closure in fn2's [[Scopes]]?
I hope someone who knows the reason can give some advice. Thank you.
Both fn2 and fn3 are created within fn1, so both are closures over (have a reference to) the lexical environment object created for the call to fn1. (The lexical environment object is what holds the bindings for a, b, c, and a few other things.) That's true whether they use that link (like fn3 does, by accessing fn1's local variables) or not (fn2).
Some JavaScript engines have experimented with optimizing that away when possible: removing a function's link to the environment where it was created (linking it directly to the outer environment instead). IIRC, V8 (in Chrome) did experiment with doing that, but found that in most cases the runtime cost of doing that analysis was more expensive than just letting the closure remain, and so didn't carry that work forward when they replaced the internals of the engine a few years back.
Separately, some engines have experimented with optimizing the content of closures — removing the bindings from the lexical environment object that aren't used by any of the functions closing over it. From your screenshot, it looks like V8 does still do that part, at least in trivial cases like that one, since only the bindings that fn3 uses are listed (a and c, not b). At a specification level, all of those bindings would be retained (not that we could tell in our code), but engines are free to optimize where the optimization doesn't lead to out-of-specification behavior.
When you return fn2, a closure is created that keeps the variables of the outer scope available.
JS doesn't check whether you actually use those variables; they are added to the closure nonetheless.
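If you want to poke at what the engine actually keeps, a small variation of the snippet is instructive (the [[Scopes]] display is a DevTools/engine detail and may change between versions):
function fn1() {
    const a = 1;
    const b = 2;
    const c = 3;
    function fn2() {
        console.log(b);   // fn2 now uses b as well
    }
    function fn3() {
        console.log(a);
        console.log(c);
    }
    return fn2;
}
// In current Chrome the Closure (fn1) entry under [[Scopes]] now typically
// lists a, b and c: what is retained depends on what any inner function
// created in fn1 references, not on which one happens to be returned.
console.dir(fn1());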

Are JavaScript execution contexts and JavaScript objects the same thing from a logical point of view?

My understanding of JavaScript is that on script execution a global execution context is created - what I understand to be a series of key:value pairs in reserved memory space - much like a regular JavaScript object.
On function execution, a new execution context is created with access to the 'parent' execution context. This seems (to me at least) to be the equivalent of re-initializing the JavaScript execution environment, except that none of the key:value pairs initially set up in the global object need to be added to the new execution context, since the child execution context has access to the parent's key:value pairs.
So for example:
function Y() {
    this.prop = 4;
};
Y.prototype.doY = function() {
    console.log(this.prop);
};
function X() {
    this.prop = 5;
    this.y = Object.create(Y.prototype);
    Y.call(this.y);
};
// I can instantiate the object like so:
var x = new X();
// Which is equivalent to the following:
var x = Object.create(X.prototype);
X.call(x);
// Then I can add a function to the object that the X constructor points to via the X.prototype reference:
X.prototype.doX = function() {
    console.log(this.prop);
};
// And then I can execute doX() in the context of the x object:
x.doX();    // 5
x.y.doY();  // 4
On execution, the doX function creates an execution context referenced from within the x object. The x object in turn is referenced within the global object.
Likewise, on execution the doY function creates an execution context that is referenced from within the x.y object.
It seems to me that an execution context as created by function execution is basically equivalent to a JavaScript object from a logical point of view:
Both allow for variable declaration NOT visible to parent objects/execution contexts
Both contain some kind of internal reference to parent object/execution context
It seems that both objects and execution contexts are just lists of key:value pairs, the whole list being referenced by a single (ONLY a single) parent object (i.e. the y object exists as a reference within the x execution context) - or by the global object as the root parent (the x object exists as a reference within the global execution context).
So the question is: Is JavaScript execution context the same as a JavaScript object? If not, what is the difference?
This question: Are Execution Context and Variable Object actually same thing in JavaScript? mentions that in terms of implementation, that an execution context and an object are NOT the same thing.
Would it be correct to say that execution context inherits from Object? Or are the implementations of objects and execution contexts completely unrelated... One difference that I can see is that on creation, an execution context is 'processed' from top to bottom (lexically speaking), allowing an author to specify imperative commands - i.e. instruct the computer to do things. But in terms of architecture, this difference seems like an extension of the idea of an object rather than something completely different.
It seems to me that a 'running JavaScript environment', if such a thing exists, is basically a tree. The root node is the global execution context, and creating objects adds nodes to the root node, or (as in the case of y above) adds nodes to the child nodes. Child nodes then reference parent nodes via a property to allow for scoping to the parent execution context.
Then in terms of closures, which involve creation (and execution) of an execution context, on return the resultant object seems EXACTLY like a regular object since the 'execution context' as referenced by the closure will never again be executed (i.e. the same function, re-executed would create a new execution context and closure).
So from an "applied" point of view, is there ever a time that this -> i.e. a self reference to the current execution context, is NOT the same as the object from which a function call is made (aside from when using call, apply, or bind)? i.e. in the case of x.y.doY(), the function doY is called from the x.y object.
Execution contexts are conceptually objects, yes, as are several other things in the specification. That doesn't make them the "same" as JavaScript objects. They're a specification device and may not literally exist at runtime at all; code cannot directly reference them. For the full picture, read Executable Code and Execution Contexts in the spec.
Is JavaScript execution context the same as a JavaScript object?
No. A JavaScript object can be directly referenced by code, has a prototype, and has specific behaviors described for it which are not described for execution contexts, etc.
Would it be correct to say that execution context inherits from Object?
No, nothing in the spec suggests that to be the case.
It seems to me that a 'running JavaScript environment' if such a thing exists, is basically a tree.
Sort of, fairly indirectly. The spec doesn't describe any link from an execution context to its parent or child contexts. Instead, the lexical environment objects which are part of the context's state (specifically the LexicalEnvironment and VariableEnvironment state components) have a link to their parent lexical environment for the purposes of binding resolution. (The parent lexenv doesn't have a link to its children in the spec.) But they, too, are purely a specification device which may not literally exist at runtime and cannot be directly referenced by code.
So it's absolutely correct to think of them as objects, that's how the spec describes them, but they aren't JavaScript objects in the way that term is normally understood — something code can directly reference and act upon.
No, they are not the same thing. Not at all.
First, you seem to be confusing execution contexts with environment records. The former are basically call stack frames, the latter ones store variable values and form a tree as you described. Let's talk about environment records.
Environment records feature a key-value mapping and a link to their parent environment. In that regard, they are indeed similar to a JS object with its properties and prototype link. In fact, in a with statement or in the global scope, a so-called object environment record is backed by an actual JS object.
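A quick way to see that backing object in action (with is disallowed in strict mode; this is purely illustrative):
var backing = { x: 1 };
with (backing) {
    x = 2;   // the identifier x is resolved through the object environment
}            // record backed by `backing`, so the write lands on the object
console.log(backing.x); // 2

// The global scope works the same way in a classic script: its object
// environment record is backed by the global object.
var z = 3;
console.log(globalThis.z); // 3 (window.z in browsers)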
However, they are still very different. Most environment records are declarative, that means their shape is determined before they are used. They are like an object that is not extensible. Also they cannot have getter/setter properties. And they differ in many more minor details.
But most importantly, environment records are just specification type values. They "are specification artefacts that do not necessarily correspond to any specific entity within an ECMAScript implementation. Specification type values may be used to describe intermediate results of ECMAScript expression evaluation but such values cannot be stored as properties of objects or values of ECMAScript language variables." So no, they do not inherit from Object at all.
Your question seems to ask: what can an execution context record that cannot be recorded in JavaScript objects?
You've got well documented answers already. So my contribution will be practical examples.
How would you manage the following 2 examples from a key/value perspective:
// Execution context of anonymous functions
(function (x) {
    var y = x + 1;
    console.log(y);
})(1);

// Life cycle of objects versus references
var x = {a: 1};
var y = x;
x = {b: 1};
console.log(y);
If you start inventing keys that are not in the code, you should realize that this approach is limited, and that it stops working if you push beyond those limits.

Over how much of its enclosing scope does a (javascript) closure close?

When I have some function that uses variables from its enclosing scope(s) and I use that function outside of this scope (these scopes), then this is called a closure.
Is there any specification about over "how much" of its enclosing scope(s) a function has to close? (Or, put differently, over "how less" it absolutely needs to close?)
Consider:
function Outer() {
    var bigGuy = createSomethingHuge();
    var tinyNumber = 42;
    return (function () { /* CONTENTS */ });
}
Or even:
function Layer1() {
    var bigOne = somethingHugePlease();
    var Layer2 = function() {
        var bigToo = morePlease();
        var Layer3 = function() {
            var huge = reallyHuge();
            var tiny = 42;
            return (function () { /* CONTENTS */ });
        };
        return Layer3();
    };
    return Layer2();
}
Which of these variables does the final returned function close over? Does it depend on the contents of that final function (eval ...?)?
I'm mostly interested into whether there is some kind of specification for these cases, not so much about the behavior of some specific implementation.
I'm mostly interested into whether there is some kind of specification for these cases
The ECMAScript specification does not really detail this. It simply says that a function closes over the whole lexical environment which includes all variables in all parent scopes, organised in so-called environment records.
Yet it does not specify how an implementation should do garbage collection - so engines do have to optimise their closures themselves - and they typically do, when they can deduce that some "closed over" variable is never needed (referenced). Conversely, if you do use eval anywhere in the closure, they of course cannot do that and have to retain everything.
not so much about the behavior of some specific implementation
Regardless, you'll want to have a look at How JavaScript closures are garbage collected, garbage collection with node.js, About closure, LexicalEnvironment and GC and How are closures and scopes represented at run time in JavaScript
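To make that concrete with the Outer example from the question, here is a sketch of the two extremes; what a particular engine actually frees is an implementation detail:
function Outer() {
    var bigGuy = createSomethingHuge();
    var tinyNumber = 42;
    // CONTENTS only references tinyNumber: per the specification the returned
    // function closes over the whole environment, but typical engines can
    // prove that bigGuy is never referenced and free it.
    return (function () { return tinyNumber; });
}

function OuterWithEval() {
    var bigGuy = createSomethingHuge();
    var tinyNumber = 42;
    // CONTENTS contains a direct eval: the engine cannot know which names the
    // evaluated string will reference, so everything (bigGuy included) has to
    // stay alive as long as the returned function does.
    return (function (code) { return eval(code); });
}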
The scope chain of a nested function contains references to the activation objects of all outer functions whose execution resulted in the definition of the nested function. These activation objects store ALL the values that were in place when the outer function call returned, and they continue to exist because they are in a scope chain.
So a closure, by definition, captures ALL variable values in scope. Running eval("typeof bigGuy"); within function () { /* CONTENTS */ } should demonstrate this.
The ECMA standards probably* cover this (if you are writing a JavaScript engine and have the time). A solution may be to set large variables to undefined when their value is no longer required.
*Based on reading ECMA 3.61 ten years ago. The computer science term "closure" did not appear in the standard. The production of closures has to be deduced by the reader as an implicit consequence of how nested function scope chains are constructed - which is in the standard. Scope chains comprise the currently executing function's activation object, outer functions' activation objects in reverse order, followed by the global object. The scope chain cannot be modified in user code.
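A minimal sketch of that "set it to undefined" workaround, reusing the names from the question (doSomethingWith is a hypothetical placeholder; whether a given engine would have freed bigGuy on its own is implementation-dependent):
function Outer() {
    var bigGuy = createSomethingHuge();
    var tinyNumber = 42;

    doSomethingWith(bigGuy);   // hypothetical: use the large value while needed
    bigGuy = undefined;        // explicitly drop the reference before creating
                               // the closure, so it cannot be kept alive
    return (function () { return tinyNumber; });
}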

deeper understanding of closure in Javascript

I was reading the comments on an answer and saw this comment:
[the closure] doesn't persist the state of foo so much as creates a special scope containing (1) the returned function and (2) all the external variables referenced at the time of the return. This special scope is called a closure.
OK, so far so good. Now here is the interesting part that I didn't know about:
Case in point... if you had another var defined in foo that was not referenced in the return function, it would not exist in the closure scope.
I guess that makes sense, but what implications does this have other than memory use / performance?
Question -- If all variables in scope were included in the closure, what would that allow me to do that I cannot do with the current model?
I think you're taking that comment too literally. The comment is just saying that you can't access it outside the function scope (it's not publicly accessible), not that it's not available at all within the function. The returned function will have access to all of the outer function's scope no matter what. You just can't access that scope outside the outer function if the inner function doesn't provide a way of accessing it.
For instance, this call alerts 4:
function testClosure() {
    var x = 2;
    return function(y) {
        alert(eval(y));
    };
}
var closure = testClosure();
closure("x+2"); // 4
http://jsfiddle.net/dmRcH/
So x is available despite not being directly referenced
Further research
It appears that Chrome and Firefox, at least, do attempt to optimize this, in the sense that if you don't provide ANY way to reference the x variable, it doesn't show up as being available in the debugger. Running this with a breakpoint inside a closure shows x as unavailable on Chrome 26 and Firefox 18.
http://jsfiddle.net/FgekX/1/
But that's just a memory management detail, not a relevant property of the language. If there is any possible way that you could reference the variable, it is kept in the closure, and my suspicion is that other browsers may not optimize this in the same way. It's always better to code to the spec than to an implementation. In this case, though, the rule really is: "if there's any possible way for you to access it, it will be available". And also, don't use eval, because it really will keep your code from being optimized.
if you had another var defined in foo that was not referenced in the return function, it would not exist in the closure scope.
This is not entirely accurate; the variable is part of the closure scope, even though it may not be directly referenced inside the function itself (by looking at the function code). The difference is how the engines optimize unused variables.
For instance, unused variables in the closure scope are known to cause memory leaks (on certain engines) when you're working with DOM elements. Take this classical example for instance:
function addHandler() {
    var el = document.getElementById('el');
    el.onclick = function() {
        this.style.backgroundColor = 'red';
    };
}
Source
In the above code, memory is leaked (in both IE and Mozilla at least) because el is part of the closure scope for the click handler function, even though it's not referenced; this creates a cyclic reference which can't be removed when the garbage collection is based on reference counting.
Chrome, on the other hand, uses a different garbage collector:
In V8, the object heap is segmented into two parts: new space where objects are created, and old space to which objects surviving a garbage collection cycle are promoted. If an object is moved in a garbage collection cycle, V8 updates all pointers to the object.
This is also referred to as a generational or ephemeral garbage collector. Albeit more complicated, this type of garbage collector can more accurately establish whether a variable is used or not.
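For completeness, the classic way to avoid this particular leak on reference-counting collectors is to break the cycle yourself by clearing the closed-over reference before the function returns, for example:
function addHandler() {
    var el = document.getElementById('el');
    el.onclick = function() {
        this.style.backgroundColor = 'red';
    };
    el = null;   // break the DOM element <-> closure cycle; the handler
                 // still works because it uses `this`, not `el`
}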
JavaScript has no inherent sense of privacy so we use function scope (closure) to simulate this functionality.
The SO answer you reference is an example of the Module pattern which Addy Osmani does a great job explaining the importance of in Learning JavaScript Design Patterns:
The Module pattern encapsulates "privacy", state and organization
using closures. It provides a way of wrapping a mix of public and
private methods and variables, protecting pieces from leaking into the
global scope and accidentally colliding with another developer's
interface. With this pattern, only a public API is returned, keeping
everything else within the closure private.
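A minimal sketch of that pattern (the names are illustrative):
var counterModule = (function () {
    var count = 0;   // private state, reachable only through the closure

    // only this public API escapes the function scope
    return {
        increment: function () { count += 1; return count; },
        reset: function () { count = 0; }
    };
})();

counterModule.increment();   // 1
counterModule.increment();   // 2
// counterModule.count is undefined - the private variable never leaks out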
Please have a look at the code below:
for (var i = 0; i < 5; i++) {
    setTimeout(function() {
        console.log(i);
    }, 1000);
}
What will the output be here? Not 0,1,2,3,4 - it will be 5,5,5,5,5, because of the closure.
So how can it be solved? The answer is below:
for (var i = 0; i < 5; i++) {
    (function(j) {   // using an IIFE
        setTimeout(function() {
            console.log(j);
        }, 1000);
    })(i);
}
Let me explain simply: when a function is created, nothing happens until it is called. The for loop in the first snippet schedules the callback 5 times but does not call it immediately; the callbacks run about 1 second later, asynchronously, by which time the loop has already finished and stored the value 5 in var i. So the setTimeout callback executes five times and prints 5,5,5,5,5.
Here is how it is solved using an IIFE (Immediately Invoked Function Expression):
(function(j) {   // i is passed in here
    setTimeout(function() {
        console.log(j);
    }, 1000);
})(i);   // invoked immediately, so it captures i=0 on the 1st iteration,
         // i=1 on the 2nd, and so on - printing 0,1,2,3,4
To go deeper, study execution contexts in order to understand closures.
- There is one more way to solve this, using let (an ES6 feature); under the hood it works like the function above, creating a new binding of i for each iteration:
for (let i = 0; i < 5; i++) {
    setTimeout(function() {
        console.log(i);
    }, 1000);
}
Output: 0,1,2,3,4

About closure, LexicalEnvironment and GC

As of ECMAScript v5, each time control enters code, the engine creates a LexicalEnvironment (LE) and a VariableEnvironment (VE). For function code, these two are exactly the same reference, which is the result of calling NewDeclarativeEnvironment (ECMAScript v5, 10.4.3), and all variables declared in function code are stored in the environment record component of the VariableEnvironment (ECMAScript v5, 10.5). This is the basic concept behind closures.
What confused me is how Garbage Collect works with this closure approach, suppose I have code like:
function f1() {
    var o = LargeObject.fromSize('10MB');
    return function() {
        // o is never used here
        return 'Hello world';
    };
}
var f2 = f1();
after the line var f2 = f1(), our object graph would be:
global -> f2 -> f2's VariableEnvironment -> f1's VariableEnvironment -> o
So, to my limited knowledge, if the JavaScript engine uses a reference counting method for garbage collection, the object o has at least 1 reference and would never be GCed. Apparently this would result in a waste of memory, since o will never be used but is always kept in memory.
Someone may say the engine knows that f2's VariableEnvironment doesn't use f1's VariableEnvironment, so the entirety of f1's VariableEnvironment can be GCed. So here is another code snippet which may lead to a more complex situation:
function f1() {
    var o1 = LargeObject.fromSize('10MB');
    var o2 = LargeObject.fromSize('10MB');
    return function() {
        alert(o1);
    };
}
var f2 = f1();
In this case, f2 uses the o1 object which is stored in f1's VariableEnvironment, so f2's VariableEnvironment must keep a reference to f1's VariableEnvironment. As a result, o2 cannot be GCed either, which again results in a waste of memory.
So I would ask: how do modern JavaScript engines (JScript.dll / V8 / SpiderMonkey ...) handle such situations? Is there a standard rule specified, or is it implementation-dependent? And what are the exact steps a JavaScript engine takes with such an object graph when it runs garbage collection?
Thanks.
tl;dr answer: "Only variables referenced from inner fns are heap allocated in V8. If you use eval then all vars assumed referenced.". In your second example, o2 can be allocated on the stack and is thrown away after f1 exits.
I don't think they can handle it. At least we know that some engines cannot, as this is known to be the cause of many memory leaks, as for example:
function outer(node) {
    node.onclick = function inner() {
        // some code not referencing "node"
    };
}
where inner closes over node, forming a circular reference inner -> outer's VariableContext -> node -> inner, which will never be freed in for instance IE6, even if the DOM node is removed from the document. Some browsers handle this just fine though: circular references themselves are not a problem, it's the GC implementation in IE6 that is the problem. But now I digress from the subject.
A common way to break the circular reference is to null out all unnecessary variables at the end of outer. I.e., set node = null. The question is then whether modern javascript engines can do this for you, can they somehow infer that a variable is not used within inner?
I think the answer is no, but I can be proven wrong. The reason is that the following code executes just fine:
function get_inner_function() {
    var x = "very big object";
    var y = "another big object";
    return function inner(varName) {
        alert(eval(varName));
    };
}
func = get_inner_function();
func("x");
func("y");
See for yourself using this jsfiddle example. There are no references to either x or y inside inner, but they are still accessible using eval. (Amazingly, if you alias eval to something else, say myeval, and call myeval, you DO NOT get a new execution context - this is even in the specification, see sections 10.4.2 and 15.1.2.1.1 in ECMA-262.)
Edit: As per your comment, it appears that some modern engines actually do some smart tricks, so I tried to dig a little more. I came across this forum thread discussing the issue, and in particular, a link to a tweet about how variables are allocated in V8. It also specifically touches on the eval problem. It seems that the engine has to parse the code in all inner functions and see what variables are referenced, or whether eval is used, and then determine whether each variable should be allocated on the heap or on the stack. Pretty neat. Here is another blog that contains a lot of details on the ECMAScript implementation.
This has the implication that even if an inner function never "escapes" the call, it can still force variables to be allocated on the heap. E.g.:
function init(node) {
    var someLargeVariable = "...";
    function drawSomeWidget(x, y) {
        library.draw(x, y, someLargeVariable);
    }
    drawSomeWidget(1, 1);
    drawSomeWidget(101, 1);
    return function () {
        alert("hi!");
    };
}
Now, as init has finished its call, someLargeVariable is no longer referenced and should be eligible for deletion, but I suspect that it is not, unless the inner function drawSomeWidget has been optimized away (inlined?). If so, this could probably occur pretty frequently when using self-executing functions to mimic classes with private / public methods.
Answer to Raynos comment below. I tried the above scenario (slightly modified) in the debugger, and the results are as I predict, at least in Chrome:
When the inner function is being executed, someLargeVariable is still in scope.
If I comment out the reference to someLargeVariable in the inner drawSomeWidget method, then you get a different result:
Now someLargeVariable is not in scope, because it could be allocated on the stack.
There is no standard specification of how GC must be implemented; every engine has its own implementation. I know a little about V8; it has a very impressive garbage collector (stop-the-world, generational, accurate). For example 2 above, the V8 engine would take roughly the following steps:
create f1's VariableEnvironment object, called f1.
after creating that object, V8 creates an initial hidden class of f1, called H1.
indicate that f1 is pointed to from f2 at the root level.
create another hidden class, H2, based on H1, then add information to H2 describing the object as having one property, o1, stored at offset 0 in the f1 object.
update f1 to point to H2, indicating that f1 should use H2 instead of H1.
create another hidden class, H3, based on H2, and add the property o2, stored at offset 1 in the f1 object.
update f1 to point to H3.
create the anonymous function's VariableEnvironment object, called a1.
create an initial hidden class of a1, called A1.
indicate that a1's parent is f1.
When parsing a function literal, the engine creates a FunctionBody but only parses that FunctionBody when the function is actually called. The following code shows that no error is thrown at parse time:
function p() {
    return function() { alert(a); };
}
p();
So at GC time H1 and H2 will be swept, because nothing references them any more. In my mind, if the code is lazily compiled, there is no way to tell that the o1 variable referenced in a1 points into f1 until the JIT actually compiles it.
if the javascript engine uses a reference counting method
Most JavaScript engines use some variant of a compacting mark-and-sweep garbage collector, not a simple reference-counting GC, so reference cycles do not cause problems.
They also tend to do some tricks so that cycles that involve DOM nodes (which are reference counted by the browser outside the JavaScript heap) don't introduce uncollectible cycles. The XPCOM cycle collector does this for Firefox.
The cycle collector spends most of its time accumulating (and forgetting about) pointers to XPCOM objects that might be involved in garbage cycles. This is the idle stage of the collector's operation, in which special variants of nsAutoRefCnt register and unregister themselves very rapidly with the collector, as they pass through a "suspicious" refcount event (from N+1 to N, for nonzero N).
Periodically the collector wakes up and examines any suspicious pointers that have been sitting in its buffer for a while. This is the scanning stage of the collector's operation. In this stage the collector repeatedly asks each candidate for a singleton cycle-collection helper class, and if that helper exists, the collector asks the helper to describe the candidate's (owned) children. This way the collector builds a picture of the ownership subgraph reachable from suspicious objects.
If the collector finds a group of objects that all refer back to one another, and establishes that the objects' reference counts are all accounted for by internal pointers within the group, it considers that group cyclical garbage, which it then attempts to free. This is the unlinking stage of the collectors operation. In this stage the collector walks through the garbage objects it has found, again consulting with their helper objects, asking the helper objects to "unlink" each object from its immediate children.
Note that the collector also knows how to walk through the JS heap, and can locate ownership cycles that pass in and out of it.
ECMAScript Harmony is likely to include ephemerons as well, to provide weakly held references.
You might find "The future of XPCOM memory management" interesting.
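(For what it's worth, weakly held collections did later ship as WeakMap in ES2015, with WeakRef arriving later still; the names below are illustrative.)
// WeakMap keys are held weakly: once the key object becomes unreachable
// elsewhere, the entry - and the large value it maps to - can be collected.
var metadata = new WeakMap();

function attachData(node) {
    metadata.set(node, { big: new Array(1e6).fill(0) });
}

var el = { id: 'example' };   // stand-in for a DOM node
attachData(el);
el = null;                    // no other references remain, so the WeakMap
                              // entry is now eligible for garbage collection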
