How JavaScript closures are garbage collected - javascript

I've logged the following Chrome bug, which has led to many serious and non-obvious memory leaks in my code:
(These results use Chrome Dev Tools' memory profiler, which runs the GC and then takes a heap snapshot of everything not garbage collected.)
In the code below, the someClass instance is garbage collected (good):
var someClass = function() {};
function f() {
var some = new someClass();
return function() {};
}
window.f_ = f();
But it won't be garbage collected in this case (bad):
var someClass = function() {};
function f() {
var some = new someClass();
function unreachable() { some; }
return function() {};
}
window.f_ = f();
It seems that a closure (in this case, function() {}) keeps all objects "alive" if the object is referenced by any other closure in the same context, whether or not that other closure is itself reachable.
My question is about garbage collection of closures in other browsers (IE 9+ and Firefox). I am quite familiar with WebKit's tools, such as the JavaScript heap profiler, but I know little about other browsers' tools, so I haven't been able to test this.
In which of these three cases will IE9+ and Firefox garbage collect the someClass instance?

As far as I can tell, this is not a bug but the expected behavior.
From Mozilla's Memory management page: "As of 2012, all modern browsers ship a mark-and-sweep garbage-collector." "Limitation: objects need to be made explicitly unreachable".
In your examples where it fails, some is still reachable in the closure. I tried two ways to make it unreachable, and both work: either set some = null when you no longer need it, or set window.f_ = null; and it will be gone.
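A minimal sketch of the first workaround, based on the question's second snippet: some is cleared before f returns, so the shared scope no longer points at the instance. The probe property is a hypothetical addition, present only so the effect can be observed; a real fix would not expose it.

```javascript
var SomeClass = function () {};

function f() {
  var some = new SomeClass();
  function unreachable() { return some; } // still closes over `some`
  some = null; // drop the reference explicitly once it is no longer needed
  var result = function () {};
  result.probe = unreachable; // hypothetical: exposed only to observe the effect
  return result;
}

var kept = f();
console.log(kept.probe()); // null
```

The scope itself is still retained by the returned function, but it now holds null instead of the instance, so the instance can be collected.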
Update
I have tried it in Chrome 30, FF25, Opera 12 and IE10 on Windows.
The standard doesn't say anything about garbage collection, but gives some clues of what should happen.
Section 13 Function definition, step 4: "Let closure be the result of creating a new Function object as specified in 13.2"
Section 13.2 "a Lexical Environment specified by Scope" (scope = closure)
Section 10.2 Lexical Environments:
"The outer reference of a (inner) Lexical Environment is a reference to the Lexical Environment that logically
surrounds the inner Lexical Environment.
An outer Lexical Environment may, of course, have its own outer
Lexical Environment. A Lexical Environment may serve as the outer environment for multiple inner Lexical
Environments. For example, if a Function Declaration contains two nested Function Declarations then the Lexical
Environments of each of the nested functions will have as their outer Lexical Environment the Lexical
Environment of the current execution of the surrounding function."
So, a function will have access to the environment of the parent.
So, some should be available in the closure of the returning function.
Then why isn't it always available?
It seems that Chrome and FF are smart enough to eliminate the variable in some cases, but in both Opera and IE the some variable is available in the closure (NB: to view this, set a breakpoint on return null and check the debugger).
The GC could be improved to detect whether some is used in the functions, but that would be complicated.
A bad example:
var someClass = function() {};
function f() {
var some = new someClass();
return function(code) {
console.log(eval(code));
};
}
window.f_ = f();
window.f_('some');
In the example above, the GC has no way of knowing whether the variable is used (code tested and works in Chrome 30, FF25, Opera 12 and IE10).
The memory is released if the reference to the object is broken by assigning another value to window.f_.
In my opinion this isn't a bug.
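To illustrate why eval blocks this kind of analysis, here is a small sketch contrasting direct eval, which can see the enclosing function's locals, with indirect eval, which runs in the global scope. The names makeEvaluators and hiddenLocal are made up for this illustration.

```javascript
function makeEvaluators() {
  var hiddenLocal = 42; // never referenced by name in the closures below
  return {
    // Direct eval: the code string is evaluated in this function's scope chain.
    direct: function (code) { return eval(code); },
    // Indirect eval: the code string is evaluated in the global scope only.
    indirect: function (code) { return (0, eval)(code); }
  };
}

var evaluators = makeEvaluators();
console.log(evaluators.direct('hiddenLocal'));          // 42
console.log(evaluators.indirect('typeof hiddenLocal')); // "undefined"
```

Because the direct form may name any local at run time, an engine has to keep every variable in scope alive for it; the indirect form carries no such requirement.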

I tested this in IE9+ and Firefox.
function f() {
var some = [];
while(some.length < 1e6) {
some.push(some.length);
}
function g() { some; } //removing this fixes a massive memory leak
return function() {}; //or removing this
}
var a = [];
var interval = setInterval(function() {
var len = a.push(f());
if(len >= 500) {
clearInterval(interval);
}
}, 10);
I hoped to wind up with an array of 500 function() {}'s, using minimal memory.
Unfortunately, that was not the case. Each empty function holds on to a (forever unreachable, but not GC'ed) array of a million numbers.
Chrome eventually halts and dies, Firefox finishes the whole thing after using nearly 4GB of RAM, and IE grows asymptotically slower until it shows "Out of memory".
Removing either one of the commented lines fixes everything.
It seems that all three of these browsers (Chrome, Firefox, and IE) keep an environment record per context, not per closure. Boris hypothesizes the reason behind this decision is performance, and that seems likely, though I'm not sure how performant it can be called in light of the above experiment.
If I need a closure referencing some (granted, I didn't use it here, but imagine I did), then instead of
function g() { some; }
I can use
var g = (function(some) { return function() { some; }; })(some);
which fixes the memory problems by moving the closure to a different context than my other function.
This will make my life much more tedious.
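A slightly less tedious variant of the same idea, as a sketch: move the closure factory out of f entirely, so the closure that needs the data is created in another function's context. makeReader is a hypothetical helper name, and the array is kept small here so the snippet runs quickly.

```javascript
// Hypothetical helper: the closure it returns lives in makeReader's scope, not f's.
function makeReader(data) {
  return function () { return data.length; };
}

function f() {
  var some = [];
  while (some.length < 1000) {
    some.push(some.length);
  }
  var g = makeReader(some); // g closes over makeReader's `data`, not f's `some`
  var size = g();
  return function () { return size; }; // closes over a number only
}

var fn = f();
console.log(fn()); // 1000
```

Since no function defined inside f refers to some any more, the returned function should have no reason to keep the array alive, assuming the per-context behaviour described above.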
P.S. Out of curiosity, I tried this in Java (using its ability to define classes inside of functions). GC works as I had originally hoped for JavaScript.

Heuristics vary, but a common way to implement this sort of thing is to create an environment record for each call to f() in your case, and only store the locals of f that are actually closed over (by some closure) in that environment record. Then any closure created in the call to f keeps alive the environment record. I believe this is how Firefox implements closures, at least.
This has the benefits of fast access to closed-over variables and simplicity of implementation. It has the drawback of the observed effect, where a short-lived closure closing over some variable causes it to be kept alive by long-lived closures.
One could try creating multiple environment records for different closures, depending on what they actually close over, but that can get very complicated very quickly and can cause performance and memory problems of its own...
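The sharing described above is observable from script. In this sketch (makeCounter is an illustrative name), two closures created in the same call see the same variable, while a second call gets a fresh one. It shows the semantics only, not how any particular engine lays out memory.

```javascript
function makeCounter() {
  var count = 0; // closed over by both functions below
  return {
    increment: function () { count += 1; },
    read: function () { return count; }
  };
}

var counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.read()); // 2

var other = makeCounter(); // a new call gets its own environment
console.log(other.read()); // 0
```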

Maintain State between function calls
Let's say you have a function add() and you would like it to add up all the values passed to it across several calls and return the running sum, like this:
add(5); // returns 5
add(20); // returns 25 (5+20)
add(3); // returns 28 (25 + 3)
There are two ways to do this. The first is the ordinary one: define a global variable.
Of course, you can use a global variable in order to hold the total. But keep in mind that this dude will eat you alive if you (ab)use globals.
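For comparison, the global-variable version might look like this sketch. The name total is chosen here for illustration.

```javascript
var total = 0; // global state: any other script can read or overwrite it

function add(val) {
  total += val;
  return total;
}

console.log(add(5));  // 5
console.log(add(20)); // 25
console.log(add(3));  // 28
```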
The second way uses a closure, without defining a global variable:
(function(){
var addFn = function addFn(){
var total = 0;
return function(val){
total += val;
return total;
}
};
var add = addFn();
console.log(add(5));
console.log(add(20));
console.log(add(3));
}());

function Country(){
console.log("makesure country call");
return function State(){
var totalstate = 0;
if(totalstate==0){
console.log("makesure statecall");
return function(val){
totalstate += val;
console.log("hello:"+totalstate);
return totalstate;
}
}else{
console.log("hey:"+totalstate);
}
};
};
var CA=Country();
var ST=CA();
ST(5); // we have added 5 states
ST(6); // a few years later we need to add 6 more states, so the total is now 11
ST(4); // 15
var CB=Country();
var STB=CB();
STB(5); //5
STB(8); //13
STB(3); //16
var CX=Country;
var d=Country();
console.log(CX); // CX refers to the Country function itself (it was not called)
console.log(d); // d holds the State function returned by calling Country()

(function(){
function addFn(){
var total = 0;
if(total==0){
return function(val){
total += val;
console.log("hello:"+total);
return total+9;
}
}else{
console.log("hey:"+total);
}
};
var add = addFn();
console.log(add);
var r= add(5); //5
console.log("r:"+r); //14
var r= add(20); //25
console.log("r:"+r); //34
var r= add(10); //35
console.log("r:"+r); //44
var addB = addFn();
var r= addB(6); //6
var r= addB(4); //10
var r= addB(19); //29
}());

Related

JavaScript Closures and variable reference

I'm reading Cameron's HTML5, JavaScript and jQuery. In his section on JavaScript and closures he gives this example:
function f2()
{
var i = 0;
return function() {
return ++i;
};
}
When the anonymous function was defined inside function f2 it “closed” over its environment as it existed at that point of time, and kept a copy of that environment. Since the variable i was accessible when the function was declared, it is still available when the function is invoked. JavaScript has realised that the anonymous function refers to the variable i, and that this function has not been destroyed, and therefore it has not destroyed the i variable it depends on.
In this bold section where he writes "JavaScript has realised..." does this mean that when JS identifies a dependency between an enclosed variable (i.e. outside the closure) and a closure, that it retains the reference to the variable for later use, whereas if there was no dependency upon the variable it would be destroyed (garbage collected)? So var i below would be destroyed, whereas var i in the closure example above is not?
function f2()
{
var i = 0;
}
Cameron, Dane (2013-10-30). A Software Engineer Learns HTML5, JavaScript and jQuery: A guide to standards-based web applications (p. 74). Cisdal Publishing. Kindle Edition.
The short answer to your question is 'yes, that is correct'. Perhaps a longer example will help:
function main() {
var i = 0;
var int = setInterval(
function() {
console.log(++i);
if ( i > 9 ) {
clearInterval(int);
}
}, 100);
}
As per the example you gave, the variable i is referenced from the inner function, and is therefore kept around for as long as that inner function is in use.
In this example, int is also kept alive for the same reason, but here we also demonstrate how the GC will clean up when it can. Once i > 9 the interval timer is cleared, meaning that there is no longer a reference to the inner function. This in turn means that the variables i and int referenced by that inner function are no longer referenced, so the GC can destroy them all.
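A quick way to see that i survives for as long as its closure does, using f2 from the question. The variable names first and second are illustrative.

```javascript
function f2() {
  var i = 0;
  return function () {
    return ++i;
  };
}

var first = f2();
var second = f2(); // a separate call, so a separate `i`

console.log(first());  // 1
console.log(first());  // 2
console.log(second()); // 1
```

Each returned function keeps its own i alive; once first or second is no longer referenced anywhere, its i becomes eligible for collection too.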

Does V8 do garbage collection on individual pieces of a scope?

I'm interested in whether V8 does garbage collection on the contents of individual variables within a scope or whether it only does garbage collection on the entire scope?
So, if I have this code:
function run() {
"use strict";
var someBigVar = whatever;
var cnt = 0;
var interval = setInterval(function() {
++cnt;
// do some recurring action
// interval just keeps going
// no reference to someBigVar in here
}, 1000);
someBigVar = somethingElse;
}
run();
Will V8 garbage collect someBigVar? The closure in run() remains alive because of the setInterval() callback and obviously the cnt variable is still being used so the whole scope of run() cannot be garbage collected. But, there is no actual ongoing reference to someBigVar.
Does V8 only garbage collect an entire scope at a time? So, the scope of run() can't be garbage collected until the interval is stopped? Or is it smart enough to garbage collect someBigVar because it can see that there is no code in the interval callback that actually references someBigVar?
FYI, here's an interesting overview article on V8 garbage collection (it does not address this specific question).
Yes, it does. Only variables that are actually used inside of a closure are retained. Otherwise the closure would have to capture EVERYTHING that is defined in the outer scope, which could be a lot.
The only exception is if you use eval inside of a closure. Since there is no way to statically determine what is referenced by the argument of eval, engines have to retain everything.
Here is a simple experiment to demonstrate this behaviour using the weak module (run with the --expose-gc flag):
var weak = require('weak');
var obj = { val: 42 };
var ref = weak(obj, function() {
console.log('gc');
});
setInterval(function() {
// obj.val;
gc();
}, 100);
If there is no reference to obj inside of the closure you will see gc printed.

speed of accessing closure variable vs object variable

Consider the following code:
var xx=1;
var ff=function(){
return xx+1;
}
ff();
var gg=function(){
return gg.xx+1;
}
gg.xx=1;
gg();
Should there be any clear performance difference between these two approaches? It seems to me that the ff function should perform faster since it references only one variable whereas the gg function references two variables. I am developing a game and want to exploit every possible speed trick that I can.
This has been asked many times before. The only difference here is that neither example would normally be called a closure, they are simple cases of variable and property resolution.
In the case of:
var xx = 1;
var ff = function(){
return xx + 1;
}
then within the function, xx must first be resolved on the local variable object, and then on the scope chain. So that's two lookups at least.
In the case of:
var gg = function(){
return gg.xx + 1;
}
gg.xx = 1;
within the function, gg must be resolved in exactly the same way as the first case (i.e. on the local variable object and then on the scope chain), which again is two lookups. Having found gg, its properties must be searched to find xx, which may involve a number of lookups.
Given the above, it's logical to assume the first is faster.
Of course, that's just a logical deduction, performance may actually be counter to that. In some browsers, global variable lookup is faster than local regardless of the length of the scope chain. Go figure.
It is certain that performance will be different in different browsers, regardless of which way it goes. Such performance tweaks (if there is any performance benefit at all) are playing at the margins and should be treated as premature optimisation.
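If you want to measure rather than deduce, a rough timing harness could look like the sketch below. timeIt is a hypothetical helper, and micro-benchmarks like this are easily distorted by JIT warm-up and timer resolution, so treat the numbers as indicative at best.

```javascript
var xx = 1;
var ff = function () { return xx + 1; };
var gg = function () { return gg.xx + 1; };
gg.xx = 1;

// Hypothetical helper: runs fn repeatedly and reports elapsed milliseconds.
function timeIt(fn, iterations) {
  var sum = 0; // accumulate results so the calls cannot be trivially discarded
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    sum += fn();
  }
  return { ms: Date.now() - start, sum: sum };
}

console.log('ff:', timeIt(ff, 1e6).ms, 'ms');
console.log('gg:', timeIt(gg, 1e6).ms, 'ms');
```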
Edit
To code this as a closure requires something like:
var gg = (function() {
var g;
return function() {
gg = function() {
return g.xx + 1; // Here is the closure
}
if (typeof g == 'undefined') {
g = gg;
}
if (typeof g.xx == 'undefined') {
g.xx = 1;
}
return g();
}
}());
since gg doesn't have a value until the IIFE finishes, the closure can only be created at that point, and the value can only be assigned later, when the function is first run.
Note that g must still be resolved on the local variable object, then on the scope chain so still two lookups and no gain from the closure (at least no logical gain).
Edit 2
Just to be clear regarding closures:
var xx = 1;
var ff = function(){
return xx + 1;
}
Does technically form a closure, but not one worth recognising. The identifier xx is resolved on the scope chain, but there are no variables on the scope chain that remain accessible to ff after some outer execution context completes. So the closure exists only for as long as the function does and is therefore no more remarkable than lexical scope.
In contrast:
var ff = (function() {
var closureVariable;
// This "inner" function has a closure with closureVariable
// If value is undefined, get (return) the value. Otherwise, set it
return function(value) {
if (typeof value == 'undefined') {
return closureVariable;
}
closureVariable = value;
};
}());
In this case, ff has exclusive access to closureVariable, which is a variable that remains accessible after the function that created it has completed:
// set the value
ff('foo');
// get the value
console.log(ff()); // foo
closureVariable is only accessible by ff (unlike global variables) and persists over numerous calls (unlike local variables). It's this feature of closures that allows them to emulate private members.
Another feature is that many functions can have a closure (or privileged access) to the same variable, emulating a kind of inheritance.

mark and sweep in javascript(context variable)

I am reading Professional JavaScript for Web Developers.
I got stuck on this passage: "When the garbage collector runs, it marks all variables stored in memory. It then clears its mark off of variables that are in context and variables that are referenced by in-context variables."
I know that when an object can no longer be reached by any variable, the memory associated with it is reclaimed.
What does "variables that are in context" mean? Are they variables that could be found in the scope chain? But what about the "variables that are referenced by in-context variables"?
I am confused.
I'm assuming it's to avoid accidentally deleting variables used in a closure. In JavaScript, just like any other functional language, just being unreachable is not enough to tell you whether you should delete an object.
Take for example the following code:
function a () {
var x=0;
return function () {
alert(x++);
}
}
var b = a();
// at this point, the instance of `x` created by calling `a` is
// no longer reachable but we are still using it in the closure.
If we follow just the "unreachability" rule then the closure created would lose the variable x.
Consider this:
(function(){
var sobriety = [];
window.inception = function() {
var i = 0,
j = 0,
inner_level = { i: i },
level = { level: inner_level },
food = {};
return function() {
var new_level = {
level: level.level
};
new_level[i] = 'step ' + i;
new_level.level.i = i;
sobriety[i++] = new_level;
};
};
window.show_my_sobriety = function() { console.log(sobriety); };
})();
var agent = inception();
agent(); agent(); agent();
show_my_sobriety();
I admit this example is somewhat sophisticated, but I just had to make it to show the difference between i (a primitive) and inner_level (a reference type).
Here we have a module with one sobriety variable local to it, and two functions made global (by assigning them to properties of the window object). Note that these global functions will still have access to the sobriety variable even after the module that defines it has finished (it stays in-context).
The inception function, when invoked, defines five variables: two scalars (i and j) and three references (inner_level, level and food), then defines a function and returns it.
This function accesses i and level (the same context) and sobriety (the outer-level context), but not j and food. Hence the latter can be collected by the GC right after window.inception completes; the former, though, stay uncollected because they're referred to by the inner function.
Now the tricky part. While you don't see any access to inner_level in this function, it's still accessed, as it's the value of the level property of the same-named object. And when you check the results, you'll see that all three elements have the same level.i value, equal to 2. That's what is meant by "variables that are referenced by in-context variables".
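The marking idea can be sketched as a toy model of the example above. This is only an illustration with a hand-written object graph (the heap table, mark, and findGarbage are all made up here), not how any engine actually stores or scans memory.

```javascript
// Toy heap: each key is an object id, each value lists the ids it references.
var heap = {
  agent: ['env'],         // the returned closure references its environment
  env: ['level'],         // the environment holds the closed-over variable level
  level: ['inner_level'], // level.level points at inner_level
  inner_level: [],
  food: []                // nothing references food after inception returns
};

// Mark phase: everything reachable from the roots gets marked.
function mark(roots) {
  var marked = {};
  var stack = roots.slice();
  while (stack.length > 0) {
    var id = stack.pop();
    if (marked[id]) { continue; }
    marked[id] = true;
    heap[id].forEach(function (child) { stack.push(child); });
  }
  return marked;
}

// Sweep phase (reporting only): list the ids that were never marked.
function findGarbage(marked) {
  return Object.keys(heap).filter(function (id) { return !marked[id]; });
}

var garbage = findGarbage(mark(['agent']));
console.log(garbage); // [ 'food' ]
```

inner_level survives even though the closure never names it, because it is reached through level.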

Can binding out-of scope variables speed up your code?

I've been doing a lot of work in Node.js recently, and its emphasis on asynchronous modules has me relying on applying the bind function on closures to wrap asynchronous calls within loops (to preserve the values of variables at function call).
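For readers unfamiliar with the pattern, here is a minimal sketch of what bind buys you inside a loop. The array names and callAll are illustrative, and plain functions stand in for the asynchronous callbacks.

```javascript
var withoutBind = [];
var withBind = [];

for (var i = 0; i < 3; i++) {
  // Every function here shares the single `i` and sees its final value.
  withoutBind.push(function () { return i; });
  // bind fixes the current value of `i` as the function's first argument.
  withBind.push((function (index) { return index; }).bind(null, i));
}

function callAll(fns) {
  return fns.map(function (fn) { return fn(); });
}

console.log(callAll(withoutBind)); // [ 3, 3, 3 ]
console.log(callAll(withBind));    // [ 0, 1, 2 ]
```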
This got me thinking. When you bind variables to a function, you add passed values to that function's local scope. So in Node (or any JS code that refers to out of scope variables often), is it advantageous to bind out of scope variables (such as modules) to functions so that when used they are part of the local scope?
Example in plain JS:
var a = 1,
func1 = function(b) { console.log(a,b); },
func2 = (function(a,b) { console.log(a,b); }).bind(null, a);
//func1(2) vs func2(2)
Example in Node
var fs = require('fs'),
func1 = function(f) { fs.stat(f, function(err, stats){}); },
func2 = (function(fs, f) { fs.stat(f, function(err, stats){}); }).bind(null, fs);
//func1('file.txt') vs func2('file.txt')
In my above examples, will func1 or func2 be noticeably faster than the other (not including outside factors such as how long it takes to get file stats)?
Here's a little JSFiddle I threw together that does a quick and dirty benchmark: http://jsfiddle.net/AExvz/
Google Chrome 14.0.797.0 dev-m
Func1: 2-4ms
Func2: 30-46ms
Google Chrome 14.0.800.0 canary
Func1: 2-7ms
Func2: 35-39ms
Firefox 5.0
Func1: 0-1ms
Func2: 35-42ms
Opera 11.11 Build 2109
Func1: 21-32ms
Func2: 68-73ms
Safari 5.05 (7533.21.1)
Func1: 23-34ms
Func2: 71-78ms
Internet Explorer 9.0.8112.16421
Func1: 10-17ms
Func2: 14-17ms
Node 0.4.8 REPL
Func1: 10ms
Func2: 156ms # 10x more iterations (~15.6ms if both tested with 100000 iterations)
Note: Node's REPL test is unreliable because it must employ some sort of caching system. After a single benchmark of func1, func2 returned 0ms.
Feel free to contribute your results of a better benchmark.
Generally the effect of reducing scope lookups should be positive. However, the difference is probably rather minuscule on today's fast JS engines.
In some math-intensive code running on an older JS engine, I used to get some more perf by doing things like this:
function doSomething() {
var round = Math.round;
var floor = Math.floor;
//Do something that calls floor and round a lot
}
So basically, bringing functions from outside the function into the function's own scope can have a positive effect, but you should profile the code to be sure.
As some of the users in your comments have pointed out, the bind function adds some overhead so it's not really an accurate comparison. You should test it by calling the function with arguments rather than by wrapping it with another function to bind arguments to it.
Here's a test to demonstrate (original test by cwolves):
http://jsperf.com/outer-vs-inner-references/2
Setup:
var x = 10, y = 11, z = 12, z = 13, a = 14, g = 15;
Test Case #1 (Outer reference):
(function(){
for(var i=0; i<1000; i++){
x + y + z + a + g;
}
})();
Test Case #2 (Local reference):
(function(x,y,z,a,g){
for(var i=0; i<1000; i++){
x + y + z + a + g;
}
})(x,y,z,a,g);
Results:
According to this test, the second test case is much faster than the first case. Honestly, I was a bit surprised and am wondering if my own test is flawed. I knew it would be faster but figured the differences would be negligible - but apparently not?
Based on some benchmarks I completed (see question) and Jani's advice, it seems that on today's new-age browsers scope problems have been alleviated with fast engines like V8. In theory, decreasing the number of scope look-ups should increase speed, but the tests didn't support this.
For those specifically dealing with Node.js, it seems like the only overhead you need to worry about is the first iteration of a function. When something is called repeatedly in Node, it seems like the V8 engine is able to cache part of the function's execution for later use. To avoid this caching, a larger number of iterations was used for func2. Simple math showed that after scaling the test for func2, it was approximately 5.6ms slower than func1. Given the fluctuation you can see in most browsers, I would guess that both probably dance around values between 5ms and 15ms. I would recommend, however, sticking with the func1 method, as it seemed to have a slight edge and is more widely supported (I'm looking at you, IE lt 9).
