When and how does the JavaScript garbage collector work?

I have read a few articles on this (including one on MDN), and I get the general idea of how GC happens in JavaScript, but I still don't understand things like:
a) When does the garbage collector kick in (is it called at some interval, or do certain conditions have to be met)?
b) Who is responsible for garbage collection (is it part of the JavaScript engine, or of the browser/Node)?
c) Does it run on the main thread or a separate thread?
d) Which of the following has higher peak memory usage?
// first-case
// variables will be unreachable after each cycle
(function() {
  for (let i = 0; i < 10000; i++) {
    let name = 'this is name' + i;
    let index = i;
  }
})();

// second-case
// creating variable once
(function() {
  let i, name, index;
  for (i = 0; i < 10000; i++) {
    name = 'this is name' + i;
    index = i;
  }
})();

V8 developer here. The short answer is: it's complicated. In particular, different JavaScript engines, and different versions of the same engine, will do things differently.
To address your specific questions:
a) When does the garbage collector kick in (is it called at some interval, or do certain conditions have to be met)?
Depends. Probably both. Modern garbage collectors often are generational: they have a relatively small "young generation", which gets collected whenever it is full. Additionally they have a much larger "old generation", where they typically do their work in many small steps, so as to never interrupt execution for too long. One common way to trigger such a small step is when N bytes (or objects) have been allocated since the last step. Another way, especially in modern tabbed browsers, is to trigger GC activity when a tab is inactive or in the background. There may well be additional triggers beyond these two.
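If you want to watch allocation and collection happen, here is a rough sketch for Node (assuming Node is started with --expose-gc, which exposes a global gc() function; the numbers will vary by engine and version):

// Run with: node --expose-gc gc-demo.js
function mb(bytes) { return (bytes / 1024 / 1024).toFixed(1) + ' MB'; }

let junk = [];
for (let i = 0; i < 100000; i++) junk.push({ name: 'name' + i });
console.log('after allocating:', mb(process.memoryUsage().heapUsed));

junk = null;   // the objects are now unreachable
global.gc();   // request a full collection (for experiments only)
console.log('after gc:', mb(process.memoryUsage().heapUsed));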
b) Who is responsible for garbage collection (is it part of the JavaScript engine, or of the browser/Node)?
The garbage collector is part of the JavaScript engine. That said, it must have certain interactions with the respective embedder to deal with embedder-managed objects (e.g. DOM nodes) whose lifetime is tied to JavaScript objects in one way or another.
c) Does it run on the main thread or a separate thread?
Depends. In a modern implementation, typically both: some work happens in the background (in one or more threads), some steps are more efficient to do on the main thread.
d) Which of the following has higher peak memory usage?
These two snippets will (probably) have the same peak memory usage: neither of them ever lets objects allocated by more than one iteration be reachable at the same time.
Edit: if you want to read more about recent GC-related work that V8 has been doing, you can find a series of blog posts here: https://v8.dev/blog/tags/memory


Does this for loop iterate multiple times?

I have been discussing some code with colleagues:
for (const a of arr) {
  if (a.thing)
    continue;
  // do a thing
}
A suggestion was to filter this and use a forEach
arr.filter(a => !a.thing)
   .forEach(a => { /* do a thing */ });
There was a discussion about iterating more than necessary. I've looked this up, and I can't find anything. I also tried to figure out how to view the optimized output, but I don't know how to do that either.
I would expect that the filter and forEach turn into code that is very much like the for of with the continue, but I don't know how to be sure.
How can I find out? The only thing I've tried so far is google.
Your first example (the for...of loop) is O(n); it will execute n times (n being the size of the array).
Your second example (the filter + forEach) is O(n+m); it will execute n times in the filter (n being the size of the array), and then m times in the forEach (m being the size of the resulting array after the filter takes place).
As such, the first example is faster. However, in this type of example without an exceedingly large sample set the difference is probably measured in microseconds or nanoseconds.
With regard to compiler optimization, it is essentially all memory-access optimization. The major interpreters and engines analyze how functions, variables, and properties are accessed (how often, and what the shape of the access graph looks like), and use that information to optimize their hidden structures for faster access. Essentially no optimization such as loop replacement or process analysis is done on the code; for the most part it is optimized while it is running (if a specific part of the code starts taking excessively long, it may have its code optimized).
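As a rough sketch of what "the shape of the access graph" means (V8 calls these hidden classes; other engines say shapes):

// Objects created with the same properties in the same order share one
// internal shape, so property accesses on them can be cached and made fast:
function Point(x, y) { this.x = x; this.y = y; }
const a = new Point(1, 2);
const b = new Point(3, 4); // same shape as `a`

// Adding properties later, or in a different order, forces a shape
// transition at runtime and makes access sites harder to optimize:
const c = { x: 1, y: 2 };
c.z = 3;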
When first executing the JavaScript code, V8 leverages full-codegen, which directly translates the parsed JavaScript into machine code without any transformation. This allows it to start executing machine code very fast. Note that V8 does not use an intermediate bytecode representation, removing the need for an interpreter.
When your code has run for some time, the profiler thread has gathered enough data to tell which method should be optimized.
Next, Crankshaft optimizations begin in another thread. It translates the JavaScript abstract syntax tree to a high-level static single-assignment (SSA) representation called Hydrogen and tries to optimize that Hydrogen graph. Most optimizations are done at this level.
Source: https://blog.sessionstack.com/how-javascript-works-inside-the-v8-engine-5-tips-on-how-to-write-optimized-code-ac089e62b12e
*While continue may cause the execution to go to the next iteration, it still counts as an iteration of the loop.
The right answer is "it really doesn't matter". A previously posted answer states that the second approach is O(n+m), but I beg to differ. The same exact "m" operations will also run in the first approach. In the worst case, even if you consider the second batch of operations as "m" (which doesn't really make much sense: we're talking about the same n elements given as input, and that's not how complexity analysis works), in the worst case m == n and the complexity will be O(2n), which is just O(n) in the end anyway.
To directly answer your question, yes, the second approach will iterate over the collection twice while the first one will do it only once. But that probably won't make any difference to you. In cases like these, you probably want to improve readability over efficiency. How many items does your collection have? 10? 100? It's better to write code that will be easier to maintain over time than to strive for maximum efficiency all the time - because most of the time it just doesn't make any difference.
Moreover, iterating the same collection more than once doesn't mean your code runs slower. It's all about what's inside each loop. For instance:
for (const item of arr) {
  // do A
  // do B
}

Is virtually the same as:

for (const item of arr) {
  // do A
}

for (const item of arr) {
  // do B
}
The for loop itself doesn't add any significant overhead to the CPU. Although you would probably want to write a single loop anyway, if your code readability is improved when you do two loops, go ahead and do it.
Efficiency is about picking the right algorithm
If you really need to be efficient, you don't want to iterate through the whole collection, not even once. You want some smarter way to do it: either divide and conquer (O(log n)) or use hash maps (O(1)). A hash map a day keeps the inefficiency away :-)
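For instance, a sketch of replacing a repeated linear scan with a one-time index (hypothetical data):

const users = [
  { id: 'a1', name: 'Ada' },
  { id: 'b2', name: 'Bob' },
];

// O(n) on every lookup:
const findSlow = id => users.find(u => u.id === id);

// Build the index once, then every lookup is O(1):
const byId = new Map(users.map(u => [u.id, u]));
const findFast = id => byId.get(id);

console.log(findSlow('b2').name, findFast('b2').name); // Bob Bob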
Do things only once
Now, back to your example, if I find myself iterating over and over and doing the same operation every time, I'd just run the filtering operation only once, at the beginning:
// during initialization
const things = [];
const notThings = [];
for (const item of arr) {
  item.thing ? things.push(item) : notThings.push(item);
}

// now every time you need to iterate through the items...
for (const a of notThings) { // replaced arr with notThings
  // if (a.thing) // <- no need to check this anymore
  //   continue;
  // do a thing
}
And then you can freely iterate over notThings, knowing that unwanted items were already filtered out. Makes sense?
Criticism of "for of is faster than calling methods"
Some people like to state that for of will always be faster than calling forEach(). We just cannot say that. There are lots of Javascript interpreters out there and for each one there are different versions, each with its particular ways of optimizing things. To prove my point, I was able to make filter() + forEach() run faster than for of in Node.js v10 on macOS Mojave:
const COLLECTION_SIZE = 10000;
const RUNS = 10000;
const collection = Array.from(Array(COLLECTION_SIZE), (e, i) => i);

function forOf() {
  for (const item of collection) {
    if (item % 2 === 0) {
      continue;
    }
    // do something
  }
}

function filterForEach() {
  collection
    .filter(item => item % 2 === 0)
    .forEach(item => { /* do something */ });
}

const fns = [forOf, filterForEach];

function timed(fn) {
  if (!fn.times) fn.times = [];
  const i = fn.times.length;
  fn.times[i] = process.hrtime.bigint();
  fn();
  fn.times[i] = process.hrtime.bigint() - fn.times[i];
}

for (let r = 0; r < RUNS; r++) {
  for (const fn of fns) {
    timed(fn);
  }
}

for (const fn of fns) {
  const times = fn.times;
  // BigInt comparator results can't be coerced to Number, so compare explicitly
  times.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  const median = times[Math.floor(times.length / 2)];
  const name = fn.name; // fn.constructor.name would print "Function" for both
  console.info(`${name}: ${median}`);
}
Times (in nanoseconds):
forOf: 81704
filterForEach: 32709
for of was consistently slower in all tests I ran, always around 50% slower. That's the main point of this answer: Do not rely on an interpreter's implementation details, because that can (and will) change over time. Unless you're developing for embedded or high-efficiency/low-latency systems -- where you need to be as close to the hardware as possible -- get to know your algorithm complexities first.
An easy way to see how many times each part of that statement is called would be to add log statements like so and run it in the Chrome console
var arr = [1, 2, 3, 4];
arr.filter(a => { console.log("hit1"); return a % 2 != 0; })
   .forEach(a => { console.log("hit2"); });
"Hit1" should print to the console 4 times regardless in this case. If it were to iterate too many times, we'd see "hit2" output 4 times, but after running this code it only outputs twice. So your assumption is partially correct, that the second time it iterates, it doesn't iterate over the whole set. However it does iterate over the whole set once in the .filter and then iterates again over the part of the set that matches the condition again in the .filter
Another good place to look is in the MDN developer docs here especially in the "Polyfill" section which outlines the exact equivalent algorithm and you can see that .filter() here returns the variable res, which is what .forEach would be performed upon.
So while it overall iterates over the set twice, in the .forEach section it only iterates over the part of the set that matches the .filter condition.

Why is <= slower than < using this code snippet in V8?

I am reading the slides Breaking the Javascript Speed Limit with V8, and there is an example like the code below. I cannot figure out why <= is slower than < in this case; can anybody explain that? Any comments are appreciated.
Slow:
this.isPrimeDivisible = function(candidate) {
  for (var i = 1; i <= this.prime_count; ++i) {
    if (candidate % this.primes[i] == 0) return true;
  }
  return false;
}
(Hint: primes is an array of length prime_count)
Faster:
this.isPrimeDivisible = function(candidate) {
  for (var i = 1; i < this.prime_count; ++i) {
    if (candidate % this.primes[i] == 0) return true;
  }
  return false;
}
[More info] The speed improvement is significant; in my local environment test, the results are as follows:
V8 version 7.3.0 (candidate)
Slow:
time d8 prime.js
287107
12.71 user
0.05 system
0:12.84 elapsed
Faster:
time d8 prime.js
287107
1.82 user
0.01 system
0:01.84 elapsed
Other answers and comments mention that the difference between the two loops is that the first one executes one more iteration than the second one. This is true, but in an array that grows to 25,000 elements, one iteration more or less would only make a minuscule difference. As a ballpark guess, if we assume the average length as it grows is 12,500, then the difference we might expect should be around 1/12,500, or only 0.008%.
The performance difference here is much larger than would be explained by that one extra iteration, and the problem is explained near the end of the presentation.
this.primes is a contiguous array (every element holds a value) and the elements are all numbers.
A JavaScript engine may optimize such an array to be a simple array of actual numbers, instead of an array of objects which happen to contain numbers but could contain other values or no value. The first format is much faster to access: it takes less code, and the array is much smaller, so it fits better in cache. But there are some conditions that can prevent this optimized format from being used.
One condition would be if some of the array elements are missing. For example:
let a = [];
a[0] = 10;
a[2] = 20;
Now what is the value of a[1]? It has no value. (It isn't even correct to say it has the value undefined - an array element containing the undefined value is different from an array element that is missing entirely.)
There isn't a way to represent this with numbers only, so the JavaScript engine is forced to use the less optimized format. If a[1] contained a numeric value like the other two elements, the array could potentially be optimized into an array of numbers only.
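You can check the difference yourself with the in operator:

const sparse = [];
sparse[0] = 10;
sparse[2] = 20;
console.log(sparse[1]);   // undefined
console.log(1 in sparse); // false - the slot is missing entirely
sparse[1] = undefined;
console.log(1 in sparse); // true - now the slot exists, holding undefined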
Another reason for an array to be forced into the deoptimized format can be if you attempt to access an element outside the bounds of the array, as discussed in the presentation.
The first loop with <= attempts to read an element past the end of the array. The algorithm still works correctly, because in the last extra iteration:
this.primes[i] evaluates to undefined because i is past the array end.
candidate % undefined (for any value of candidate) evaluates to NaN.
NaN == 0 evaluates to false.
Therefore, the return true is not executed.
So it's as if the extra iteration never happened - it has no effect on the rest of the logic. The code produces the same result as it would without the extra iteration.
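A quick console check confirms that chain (arbitrary values):

const primes = [2, 3, 5];
console.log(primes[3]);     // undefined - reading past the end
console.log(7 % undefined); // NaN
console.log(NaN == 0);      // false, so `return true` never fires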
But to get there, it tried to read a nonexistent element past the end of the array. This forces the array out of optimization - or at least did at the time of this talk.
The second loop with < reads only elements that exist within the array, so it allows an optimized array and code.
The problem is described in pages 90-91 of the talk, with related discussion in the pages before and after that.
I happened to attend this very Google I/O presentation and talked with the speaker (one of the V8 authors) afterward. I had been using a technique in my own code that involved reading past the end of an array as a misguided (in hindsight) attempt to optimize one particular situation. He confirmed that if you tried to even read past the end of an array, it would prevent the simple optimized format from being used.
If what the V8 author said is still true, then reading past the end of the array would prevent it from being optimized and it would have to fall back to the slower format.
Now it's possible that V8 has been improved in the meantime to efficiently handle this case, or that other JavaScript engines handle it differently. I don't know one way or the other on that, but this deoptimization is what the presentation was talking about.
I work on V8 at Google, and wanted to provide some additional insight on top of the existing answers and comments.
For reference, here's the full code example from the slides:
var iterations = 25000;

function Primes() {
  this.prime_count = 0;
  this.primes = new Array(iterations);
  this.getPrimeCount = function() { return this.prime_count; }
  this.getPrime = function(i) { return this.primes[i]; }
  this.addPrime = function(i) {
    this.primes[this.prime_count++] = i;
  }
  this.isPrimeDivisible = function(candidate) {
    for (var i = 1; i <= this.prime_count; ++i) {
      if ((candidate % this.primes[i]) == 0) return true;
    }
    return false;
  }
};

function main() {
  var p = new Primes();
  var c = 1;
  while (p.getPrimeCount() < iterations) {
    if (!p.isPrimeDivisible(c)) {
      p.addPrime(c);
    }
    c++;
  }
  console.log(p.getPrime(p.getPrimeCount() - 1));
}

main();
First and foremost, the performance difference has nothing to do with the < and <= operators directly. So please don't jump through hoops just to avoid <= in your code because you read on Stack Overflow that it's slow --- it isn't!
Second, folks pointed out that the array is "holey". This was not clear from the code snippet in OP's post, but it is clear when you look at the code that initializes this.primes:
this.primes = new Array(iterations);
This results in an array with a HOLEY elements kind in V8, even if the array ends up completely filled/packed/contiguous. In general, operations on holey arrays are slower than operations on packed arrays, but in this case the difference is negligible: it amounts to 1 additional Smi (small integer) check (to guard against holes) each time we hit this.primes[i] in the loop within isPrimeDivisible. No big deal!
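If you want to inspect the elements kind yourself, V8 exposes intrinsics behind a flag; a sketch, assuming a d8 or Node build where %DebugPrint prints object details (release builds may print less):

// Run with: node --allow-natives-syntax elements-kind.js
const holey = new Array(3);               // allocated with holes
holey[0] = 1; holey[1] = 2; holey[2] = 3; // filled, but stays HOLEY_SMI_ELEMENTS

const packed = [1, 2, 3];                 // PACKED_SMI_ELEMENTS from the start

%DebugPrint(holey);
%DebugPrint(packed);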
TL;DR The array being HOLEY is not the problem here.
Others pointed out that the code reads out of bounds. It's generally recommended to avoid reading beyond the length of arrays, and in this case it would indeed have avoided the massive drop in performance. But why though? V8 can handle some of these out-of-bound scenarios with only a minor performance impact. What's so special about this particular case, then?
The out-of-bounds read results in this.primes[i] being undefined on this line:
if ((candidate % this.primes[i]) == 0) return true;
And that brings us to the real issue: the % operator is now being used with non-integer operands!
integer % someOtherInteger can be computed very efficiently; JavaScript engines can produce highly-optimized machine code for this case.
integer % undefined on the other hand amounts to a way less efficient Float64Mod, since undefined is represented as a double.
The code snippet can indeed be improved by changing the <= into < on this line:
for (var i = 1; i <= this.prime_count; ++i) {
...not because < is somehow a superior operator to <=, but just because this avoids the out-of-bounds read in this particular case.
TL;DR: The slower loop is due to accessing the array out of bounds. This either forces the engine to recompile the function with fewer or even no optimizations, or prevents it from compiling the function with those optimizations to begin with (if the (JIT) compiler detected/suspected this condition before the first compiled "version"). Read on below for why.
Someone just has to say this (utterly amazed nobody already did):
There used to be a time when the OP's snippet would be a de-facto example in a beginner's programming book, intended to outline/emphasize that arrays in JavaScript are indexed starting at 0, not 1, and as such used as an example of a common beginner's mistake (don't you love how I avoided the phrase "programming error"? ;)): out-of-bounds array access.
Example 1:
a Dense Array (contiguous, meaning no gaps between indexes, AND with an actual element at each index) of 5 elements, using 0-based indexing (always, in ES262).
var arr_five_char=['a', 'b', 'c', 'd', 'e']; // arr_five_char.length === 5
// indexes are: 0 , 1 , 2 , 3 , 4 // there is NO index number 5
Thus we are not really talking about a performance difference between < and <= (or "one extra iteration"); we are asking:
"why does the correct snippet (b) run faster than the erroneous snippet (a)?"
The answer is 2-fold (although from an ES262 language implementer's perspective both are forms of optimization):
Data-Representation: how to represent/store the Array internally in memory (object, hashmap, 'real' numerical array, etc.)
Functional Machine-code: how to compile the code that accesses/handles (read/modify) these 'Arrays'
Item 1 is sufficiently (and correctly IMHO) explained by the accepted answer, but that only spends 2 words ('the code') on Item 2: compilation.
More precisely: JIT-Compilation and even more importantly JIT-RE-Compilation !
The language specification is basically just a description of a set of algorithms ("steps to perform to achieve the defined end result"). Which, as it turns out, is a very beautiful way to describe a language.
And it leaves the actual method that an engine uses to achieve specified results open to the implementers, giving ample opportunity to come up with more efficient ways to produce defined results.
A spec conforming engine should give spec conforming results for any defined input.
Now, with JavaScript code/libraries/usage increasing, and remembering how many resources (time/memory/etc.) a "real" compiler uses, it's clear we can't make users visiting a web page wait that long (and require them to have that many resources available).
Imagine the following simple function:
function sum(arr) {
  var r = 0, i = 0;
  for (; i < arr.length;) r += arr[i++];
  return r;
}
Perfectly clear, right? Doesn't require ANY extra clarification, right? The return type is Number, right?
Well... no, no & no... It depends on what argument you pass to the named function parameter arr...
sum('abcde'); // String('0abcde')
sum([1,2,3]); // Number(6)
sum([1,,3]); // Number(NaN)
sum(['1',,3]); // String('01undefined3')
sum([1,,'3']); // String('NaN3')
sum([1,2,{valueOf:function(){return this.val}, val:6}]); // Number(9)
var val=5; sum([1,2,{valueOf:function(){return val}}]); // Number(8)
See the problem? Then consider that this is just barely scraping the surface of the massive number of possible permutations...
We don't even know what TYPE the function RETURNS until we are done...
Now imagine this same function-code actually being used on different types, or even variations, of input, both completely literally (in source code) described and dynamically in-program generated "arrays"...
Thus, if you were to compile the function sum JUST ONCE, then the only way to always return the spec-defined result for any and all types of input is, obviously, to perform ALL spec-prescribed main AND sub-steps (like an unnamed pre-y2k browser).
No optimizations (because no assumptions), and a dead-slow interpreted scripting language remains.
JIT compilation (JIT as in Just In Time) is the currently popular solution:
You start to compile the function using assumptions about what it does, returns, and accepts.
You come up with checks as simple as possible to detect whether the function might start returning non-spec-conformant results (say, because it received unexpected input).
Then you toss away the previously compiled result and recompile something more elaborate, decide what to do with the partial result you already have (is it valid to be trusted, or should it be computed again to be sure?), tie the function back into the program, and try again. Ultimately you fall back to stepwise script interpretation, as in the spec.
All of this takes time!
All browsers work on their engines, and with each and every sub-version you will see things improve and regress. Strings were at some point in history truly immutable strings (hence array.join was faster than string concatenation); now we use ropes (or similar), which alleviate the problem. Both return spec-conforming results, and that is what matters!
Long story short: just because JavaScript's semantics often have our back (as with this silent bug in the OP's example) does not mean that "stupid" mistakes increase our chances of the compiler spitting out fast machine code. It assumes we wrote the "usually" correct instructions: the current mantra we "users" (of the programming language) must have is: help the compiler, describe what you want, favor common idioms (take hints from asm.js for a basic understanding of what browsers can try to optimize, and why).
Because of this, talking about performance is both important AND a minefield, and because of said minefield I really want to end by pointing to (and quoting) some relevant material:
Access to nonexistent object properties and out of bounds array elements returns the undefined value instead of raising an exception. These dynamic features make programming in JavaScript convenient, but they also make it difficult to compile JavaScript into efficient machine code.
...
An important premise for effective JIT optimization is that programmers use dynamic features of JavaScript in a systematic way. For example, JIT compilers exploit the fact that object properties are often added to an object of a given type in a specific order or that out of bounds array accesses occur rarely. JIT compilers exploit these regularity assumptions to generate efficient machine code at runtime. If a code block satisfies the assumptions, the JavaScript engine executes efficient, generated machine code. Otherwise, the engine must fall back to slower code or to interpreting the program.
Source:
"JITProf: Pinpointing JIT-unfriendly JavaScript Code"
Liang Gong, Michael Pradel, Koushik Sen; UC Berkeley, 2014.
http://software-lab.org/publications/jitprof_tr_aug3_2014.pdf
ASM.JS (which also doesn't like out-of-bounds array access):
Ahead-Of-Time Compilation
Because asm.js is a strict subset of JavaScript, this specification only defines the validation logic—the execution semantics is simply that of JavaScript. However, validated asm.js is amenable to ahead-of-time (AOT) compilation. Moreover, the code generated by an AOT compiler can be quite efficient, featuring:
unboxed representations of integers and floating-point numbers;
absence of runtime type checks;
absence of garbage collection; and
efficient heap loads and stores (with implementation strategies varying by platform).
Code that fails to validate must fall back to execution by traditional means, e.g., interpretation and/or just-in-time (JIT) compilation.
http://asmjs.org/spec/latest/
and finally https://blogs.windows.com/msedgedev/2015/05/07/bringing-asm-js-to-chakra-microsoft-edge/,
where there is a small subsection about the engine's internal performance improvements when removing the bounds check (while merely lifting the bounds check outside the loop already gave an improvement of 40%).
EDIT: note that multiple sources talk about different levels of JIT recompilation, down to interpretation.
Theoretical example based on the above information, regarding the OP's snippet:
Call to isPrimeDivisible
Compile isPrimeDivisible using general assumptions (like no out-of-bounds access)
Do work
BAM, suddenly the array is accessed out of bounds (right at the end).
Crap, says the engine, let's recompile isPrimeDivisible using different (fewer) assumptions, and this example engine doesn't try to figure out whether it can reuse the current partial result, so
Recompute all work using the slower function (hopefully it finishes; otherwise repeat, and this time just interpret the code).
Return result
Hence the time was:
First run (which failed at the end) + doing all the work over again using slower machine code for each iteration + the recompilation, etc., which clearly takes more than 2 times longer in this theoretical example!
EDIT 2: (disclaimer: conjecture based on the facts below)
The more I think about it, the more I think that this answer might actually explain the dominant reason for this "penalty" on the erroneous snippet a (or the performance bonus on snippet b, depending on how you look at it), precisely why I'm adamant in calling it (snippet a) a programming error:
It's pretty tempting to assume that this.primes is a purely numerical "dense array" which was either:
hard-coded as a literal in the source code (a known excellent candidate to become a "real" array, as everything is already known to the compiler before compile time), OR
most likely generated using a numerical function filling a pre-sized array (new Array(/*size value*/)) in ascending sequential order (another long-known candidate to become a "real" array).
We also know that the primes array's length is cached as prime_count! (indicating its intent and fixed size).
We also know that most engines initially pass arrays as copy-on-modify (when needed), which makes handling them much faster (if you don't change them).
It is therefore reasonable to assume that the array primes is most likely already an optimized array internally, which doesn't get changed after creation (simple for the compiler to know if there is no code modifying the array after creation) and is therefore already (if applicable to the engine) stored in an optimized way, pretty much as if it were a Typed Array.
As I have tried to make clear with my sum function example, the argument(s) passed highly influence what actually needs to happen, and as such how that particular code is compiled to machine code. Passing a String to the sum function shouldn't change the string, but it changes how the function is JIT-compiled! Passing an Array to sum should compile a different (perhaps even an additional one for this type, or "shape" as they call it, of object that got passed) version of machine code.
It seems slightly bonkers to convert the Typed_Array-like primes array on the fly into something_else while the compiler knows this function is not even going to modify it!
Under these assumptions, that leaves 2 options:
Compile as a number-cruncher assuming no out-of-bounds access, run into the out-of-bounds problem at the end, recompile and redo the work (as outlined in the theoretical example in edit 1 above), OR
The compiler has already detected (or suspected?) the out-of-bounds access up front, and the function was JIT-compiled as if the argument passed was a sparse object, resulting in slower functional machine code (as it would have more checks/conversions/coercions etc.). In other words: the function was never eligible for certain optimizations; it was compiled as if it received a "sparse array"(-like) argument.
I now really wonder which of these 2 it is!
To add some scientific rigor to this, here's a jsperf:
https://jsperf.com/ints-values-in-out-of-array-bounds
It tests the control case of an array filled with ints, looping while doing modular arithmetic and staying within bounds. It has 5 test cases:
1. Looping out of bounds
2. Holey arrays
3. Modular arithmetic against NaNs
4. Completely undefined values
5. Using a new Array()
It shows that the first 4 cases are really bad for performance. Looping out of bounds is a bit better than the other 3, but all 4 are roughly 98% slower than the best case.
The new Array() case is almost as good as the raw array, just a few percent slower.

How do I create a memory leak in JavaScript?

I would like to understand what kind of code causes memory leaks in JavaScript and created the script below. However, when I run the script in Safari 6.0.4 on OS X the memory consumption shown in the Activity Monitor does not really increase.
Is something wrong with my script or is this no longer an issue with modern browsers?
<html>
<body>
</body>
<script>
  var i, el;

  function attachAlert(element) {
    element.onclick = function() { alert(element.innerHTML); };
  }

  for (i = 0; i < 1000000; i++) {
    el = document.createElement('div');
    el.innerHTML = i;
    attachAlert(el);
  }
</script>
</html>
The script is based on the Closure section of Google's JavaScript style guide:
http://google-styleguide.googlecode.com/svn/trunk/javascriptguide.xml?showone=Closures#Closures
EDIT: The bug that caused the above code to leak has apparently been fixed: http://jibbering.com/faq/notes/closures/#clMem
But my question remains: Would someone be able to provide a realistic example of JavaScript code that leaks memory in modern browsers?
There are many articles on the Internet that suggest memory leaks can be an issue for complex single-page applications, but I have a hard time finding examples that I can run in my browser.
You're not keeping the element you've created around or referencing it anywhere - that's why you're not seeing the memory usage increase. Try attaching the element to the DOM, or storing it in an object, or setting the onclick on a different element that sticks around. Then you'll see the memory usage skyrocket. The garbage collector will come through and clean up anything that can no longer be referenced.
Basically a walkthrough of your code:
create an element (el)
create a new function that references that element
set the function to be the onclick of that element
overwrite the element variable with a new element
Everything centers around the element existing. Once there isn't a way to access the element, the onclick can't be accessed anymore. So, since the onclick can't be accessed, the function that was created is destroyed... and the function had the only reference to the element... so the element is cleaned up as well.
Someone might have a more technical example, but that's the basis of my understanding of the javascript garbage collector.
Edit: Here's one of many possibilities for a leaking version of your script:
<html>
<body>
</body>
<script>
  var i, el, event;
  var createdElements = {};
  var events = [];

  function attachAlert(element) {
    element.onclick = function() { alert(element.innerHTML); };
  }

  function reallyBadAttachAlert(element) {
    return function() { alert(element.innerHTML); };
  }

  for (i = 0; i < 1000000; i++) {
    el = document.createElement('div');
    el.innerHTML = i;

    /** possibility one: you're storing the element somewhere **/
    attachAlert(el);
    createdElements['div' + i] = el;

    /** possibility two: you're storing the callbacks somewhere **/
    event = reallyBadAttachAlert(el);
    events.push(event);
    el.onclick = event;
  }
</script>
</html>
So, for #1, you're simply storing a reference to that element somewhere. It doesn't matter that you'll never use it: because that reference is kept in the object, the element and its callbacks will never go away (or at least not until you delete the element from the object). For possibility #2, you could be storing the events somewhere. Because the event can be accessed (i.e. by doing events[10]();) even though the element is nowhere to be found, the element is still referenced by the event... so the element will stay in memory, as will the event, until it's removed from the array.
update: Here is a very simple example based on the caching scenario in the Google I/O presentation:
/*
  This is an example of a memory leak. A new property is added to the cache
  object 10 times/second. The value of performance.memory.usedJSHeapSize
  steadily increases.

  Since the value of cache[key] is easy to recalculate, we might want to free
  that memory if it becomes low. However, there is no way to do that...

  A method to manually clear the cache could be added, but manually
  adding memory checks adds a lot of extra code and overhead. It would be
  nice if we could clear the cache automatically only when memory became low.
  Thus the solution presented at Google I/O!
*/
(function(w) {
  var cache = {};

  function getCachedThing(key) {
    if (!(key in cache)) {
      cache[key] = key;
    }
    return cache[key];
  }

  var i = 0;
  setInterval(function() {
    getCachedThing(i++);
  }, 100);

  w.getCachedThing = getCachedThing;
})(window);
Because usedJSHeapSize does not update when the page is opened from the local file system, you might not see the increasing memory usage. In that case, I have hosted this code for you here: https://memory-leak.surge.sh/example-for-waterfr
This Google I/O'19 presentation gives examples of real-world memory leaks as well as strategies for avoiding them:
Method getImageCached() returns a reference to an object, also caching a local reference. Even if this reference goes out of the method consumer's scope, the referenced memory cannot be garbage collected because there is still a strong reference inside the implementation of getImageCached(). Ideally, the cached reference would be eligible for garbage collection if memory got too low. (Not exactly a memory leak, but a situation where there is memory that could be freed at the cost of running the expensive operations again.)
Leak #1: the reference to the cached image. Solved by using weak references inside getImageCached().
Leak #2: the string keys inside the cache (Map object). Solved by using the new FinalizationGroup API.
Please see the linked video for JS code with line-by-line explanations.
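The talk used the then-proposed FinalizationGroup API; with today's standardized WeakRef, such a cache might be sketched like this (illustrative, not the talk's code; note that WeakRef targets must be objects, not primitives):

const cache = new Map(); // key -> WeakRef(value)

function getCachedThing(key, load) {
  const ref = cache.get(key);
  const cached = ref && ref.deref();
  if (cached !== undefined) return cached; // value is still alive
  const value = load(key);                 // recompute the expensive value
  cache.set(key, new WeakRef(value));      // hold it only weakly
  return value;
}

// If memory gets tight, the GC may reclaim values; deref() then
// returns undefined and the value is recomputed on the next request.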
More generally, "real" JS memory leaks are caused by unwanted references (to objects that will never be used again). They are usually bugs in the JS code. This article explains four common ways memory leaks are introduced in JS (minimal sketches follow the list):
Accidental global variables
Forgotten timers/callbacks
Out of DOM references
Closures
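Minimal sketches of those four patterns (illustrative names and sizes, not taken from the article):

// 1. Accidental global: without 'use strict', the assignment below creates
//    a property on window/globalThis that never goes away
function leakGlobal() { oopsCache = new Array(1000000).fill('*'); }

// 2. Forgotten timer: the interval callback keeps bigData reachable forever
const bigData = { payload: new Array(1000000).fill('*') };
setInterval(() => console.log(bigData.payload.length), 60000);

// 3. Out-of-DOM reference: removed from the page, but still held in JS
const detached = document.createElement('div');
document.body.appendChild(detached);
detached.remove(); // gone from the DOM tree, not from memory

// 4. Closure: the returned handler closes over hugeBuffer, keeping it alive
function makeHandler() {
  const hugeBuffer = new Array(1000000).fill('*');
  return () => hugeBuffer.length;
}
window.keptHandler = makeHandler();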
The article "An interesting kind of JavaScript memory leak" documents how closures caused a memory leak in the popular MeteorJS framework.
2020 Update:
Most CPU-side memory-overflow tricks no longer work in modern V8-based browsers. However, we can overflow GPU-side memory by running this script:
// Initialize a canvas and its context
window.reallyFatCanvas = document.createElement('canvas');
let context = window.reallyFatCanvas.getContext('2d');

// Reference a new context inside the current context, in a loop.
function leakingLoop() {
  context.canvas.width = document.body.clientWidth;
  context.canvas.height = document.body.clientHeight;
  const newContext = document.createElement('canvas').getContext('2d');
  context.context = newContext;
  context.drawImage(newContext.canvas, 0, 0);
  // The new context will reference another context on the next loop
  context = newContext;
}

// Use an interval instead of while(true) {...}
setInterval(leakingLoop, 1);
EDIT: I renamed all the variables (and constants) so the code makes more sense. Here is the explanation:
Based on my observation, the canvas context seems to be kept in sync with video memory. So if we keep a reference to a canvas object which also references another canvas object, and so on, the video RAM fills up much faster than DRAM; tested on Microsoft Edge and Chrome.
(Screenshot omitted.) I have no idea why my laptop always freezes a few seconds after taking a screenshot while running this script. Please be careful if you want to try it.
I tried something like this and got an out-of-memory exception:
const test = (array) => {
  array.push((new Array(1000000)).fill('test'));
};

const testArray = [];
for (let i = 0; i <= 1000; i++) {
  test(testArray);
}
The Easiest Way Is:
while(true){}
A small example of code causing a 1MB memory leak:
Object.defineProperty(globalThis, Symbol(), {value: new Uint8Array(1<<20).slice(), writable: false, configurable: false})
After you run that code, the only way to free the leaked memory is to close the tab you ran it on.
If all you want is to create a memory leak, then the easiest way IMO is to instantiate a TypedArray, since it hogs a fixed chunk of memory for as long as any reference to it lives. For example, creating a Float64Array with 2^27 elements consumes 1 GiB (gibibyte) of memory, since it needs 8 bytes per element.
Start the console and just write this:
new Float64Array(Math.pow(2, 27))

JavaScript Performance: Multiple variables or one object?

This is just a simple performance question to help me understand the JavaScript engine.
I was wondering: what is faster, declaring multiple variables for certain values, or using one object containing multiple values?
example:
var x = 15;
var y = 300;
vs.
var sizes = { x: 15, y: 300 };
This is just a very simple example; it could of course differ in a real project.
Does this even matter?
A complete answer to that question would be really long, so I'll try to explain just a few things. First, and maybe most important: even if you declare a variable with var, it depends where you do that. In the global scope, you implicitly also write that variable into an object; most browsers call it window. So for instance:
// global scope
var x = 15;
console.log( window.x ); // 15
If we do the same thing within the context of a function, things change. Within the context of a function, we would write that variable name into its so-called "Activation Object": an internal object which the JS engine handles for you. All formal parameters, function declarations, and variables are stored there.
Now to answer your actual question: within the context of a function, it's always fastest to access variables declared with var. This again is not necessarily true in the global context. The global object is very large, and it's not really fast to access anything within it.
If we store things within an object, it's still very fast, but not as fast as variables declared with var. Especially the access times increase. But nonetheless, we are talking about micro- and nanoseconds here (in modern browser implementations). Old-ish browsers, especially IE6/7, have huge performance penalties when accessing object properties.
If you are really interested in stuff like this, I highly recommend the book 'High Performance JavaScript' by Nicholas C. Zakas. He measured lots of different techniques for accessing and storing data in ECMAScript.
Again, the performance difference between object lookups and variables declared with var is almost not measurable in modern browsers. Old-ish browsers like FF3 or IE6 show fundamentally slow performance for object lookups/access.
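If you want to measure it yourself, a crude sketch (results vary heavily across engines and versions, so treat the numbers as anecdotes):

function viaVars() {
  let x = 0;
  for (let i = 0; i < 10000000; i++) x += i;
  return x;
}

function viaProp() {
  const o = { x: 0 };
  for (let i = 0; i < 10000000; i++) o.x += i;
  return o.x;
}

for (const fn of [viaVars, viaProp]) {
  const t0 = performance.now();
  fn();
  console.log(fn.name + ': ' + (performance.now() - t0).toFixed(2) + ' ms');
}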
foo_bar is always faster than foo.bar in every modern browser (IE11+/Edge and any version of Chrome, FireFox, and Safari) and NodeJS so long as you see performance as holistic (which I recommend you should). After millions of iterations in a tight loop, foo.bar may approach (but never surpass) the same ops/s as foo_bar due to the wealth of correct branch predictions. Notwithstanding, foo.bar incurs a ton more overhead during both JIT compilation and execution because it is so much more complex of an operation. JavaScript that features no tight loops benefits an extra amount from using foo_bar because, in comparison, foo.bar would have a much higher overhead:savings ratio such that there was extra overhead involved in the JIT of foo.bar just to make foo.bar a little faster in a few places. Granted, all JIT engines intelligently try to guess how much effort should be put into optimizing what to minimize needless overhead, but there is still a baseline overhead incurred by processing foo.bar that can never be optimized away.
Why? JavaScript is a highly dynamic language, where there is costly overhead associated with every object. It was originally a tiny scripting language executed line-by-line and still exhibits line-by-line execution behavior (it's not executed line-by-line anymore but, for example, one can do something evil like var a=10;eval('a=20');console.log(a) to log the number 20). JIT compilation is highly constrained by the fact that JavaScript must observe line-by-line behavior. Not everything can be anticipated by JIT, so all code must be slow in order for extraneous code such as is shown below to run fine.
(function() {"use strict";
// chronological optimization is very poor because it is so complicated and volatile
var setTimeout=window.setTimeout;
var scope = {};
scope.count = 0;
scope.index = 0;
scope.length = 0;
function increment() {
// The code below is SLOW because JIT cannot assume that the scope object has not changed in the interum
for (scope.index=0, scope.length=17; scope.index<scope.length; scope.index=scope.index+1|0)
scope.count = scope.count + 1|0;
scope.count = scope.count - scope.index + 1|0;
}
setTimeout(function() {
console.log( scope );
}, 713);
for(var i=0;i<192;i=i+1|0)
for (scope.index=11, scope.length=712; scope.index<scope.length; scope.index=scope.index+1|0)
setTimeout(increment, scope.index);
})();
(function() {"use strict";
// chronological optimization is very poor because it is so complicated and volatile
var setTimeout=window.setTimeout;
var scope_count = 0;
var scope_index = 0;
var scope_length = 0;
function increment() {
// The code below is FAST because JIT does not have to use a property cache
for (scope_index=0, scope_length=17; scope_index<scope_length; scope_index=scope_index+1|0)
scope_count = scope_count + 1|0;
scope_count = scope_count - scope_index + 1|0;
}
setTimeout(function() {
console.log({
count: scope_count,
index: scope_index,
length: scope_length
});
}, 713);
for(var i=0;i<192;i=i+1|0)
for (scope_index=4, scope_length=712; scope_index<scope_length; scope_index=scope_index+1|0)
setTimeout(increment, scope_index);
})();
Performing a one-sample z-interval by running each code snippet above 30 times and seeing which one gave a higher count, I am 90% confident that the latter code snippet with pure variable names is faster than the first code snippet with object access between 76.5% and 96.9% of the time. As another way to analyze the data, there is a 0.0000003464% chance that the data I collected was a fluke and the first snippet is actually faster. Thus, I believe it is reasonable to infer that foo_bar is faster than foo.bar because there is less overhead.
Don't get me wrong. Hash maps are very fast because many engines feature advanced property caches, but there will still always be enough extra overhead when using hash maps. Observe.
(function(){"use strict"; // wrap in iife
// This is why you should not pack variables into objects
var performance = window.performance;
var iter = {};
iter.domino = -1; // Once removed, performance topples like a domino
iter.index=16384, iter.length=16384;
console.log(iter);
var startTime = performance.now();
// Warm it up and trick the JIT compiler into false optimizations
for (iter.index=0, iter.length=128; iter.index < iter.length; iter.index=iter.index+1|0)
if (recurse_until(iter, iter.index, 0) !== iter.domino)
throw Error('mismatch!');
// Now that its warmed up, drop the cache off cold and abruptly
for (iter.index=0, iter.length=16384; iter.index < iter.length; iter.index=iter.index+1|0)
if (recurse_until(iter, iter.index, 0) !== iter.domino)
throw Error('mismatch!');
// Now that we have shocked JIT, we should be running much slower now
for (iter.index=0, iter.length=16384; iter.index < iter.length; iter.index=iter.index+1|0)
if (recurse_until(iter, iter.index, 0) !== iter.domino)
throw Error('mismatch!');
var endTime=performance.now();
console.log(iter);
console.log('It took ' + (endTime-startTime));
function recurse_until(obj, _dec, _inc) {
var dec=_dec|0, inc=_inc|0;
var ret = (
dec > (inc<<1) ? recurse_until(null, dec-1|0, inc+1|0) :
inc < 384 ? recurse_until :
// Note: do not do this in production. Dynamic code evaluation is slow and
// can usually be avoided. The code below must be dynamically evaluated to
// ensure we fool the JIT compiler.
recurse_until.constructor(
'return function(obj,x,y){' +
// rotate the indices
'obj.domino=obj.domino+1&7;' +
'if(!obj.domino)' +
'for(var key in obj){' +
'var k=obj[key];' +
'delete obj[key];' +
'obj[key]=k;' +
'break' +
'}' +
'return obj.domino' +
'}'
)()
);
if (obj === null) return ret;
recurse_until = ret;
return obj.domino;
}
})();
For a performance comparison, observe pass-by-reference via an array and local variables.
// This is the correct way to write blazingly fast code
(function(){"use strict"; // wrap in iife
var performance = window.performance;
var iter_domino=[0,0,0]; // Now, domino is a pass-by-reference list
var iter_index=16384, iter_length=16384;
var startTime = performance.now();
// Warm it up and trick the JIT compiler into false optimizations
for (iter_index=0, iter_length=128; iter_index < iter_length; iter_index=iter_index+1|0)
if (recurse_until(iter_domino, iter_index, 0)[0] !== iter_domino[0])
throw Error('mismatch!');
// Now that its warmed up, drop the cache off cold and abruptly
for (iter_index=0, iter_length=16384; iter_index < iter_length; iter_index=iter_index+1|0)
if (recurse_until(iter_domino, iter_index, 0)[0] !== iter_domino[0])
throw Error('mismatch!');
// Now that we have shocked JIT, we should be running much slower now
for (iter_index=0, iter_length=16384; iter_index < iter_length; iter_index=iter_index+1|0)
if (recurse_until(iter_domino, iter_index, 0)[0] !== iter_domino[0])
throw Error('mismatch!');
var endTime=performance.now();
console.log('It took ' + (endTime-startTime));
function recurse_until(iter_domino, _dec, _inc) {
var dec=_dec|0, inc=_inc|0;
var ret = (
dec > (inc<<1) ? recurse_until(null, dec-1|0, inc+1|0) :
inc < 384 ? recurse_until :
// Note: do not do this in production. Dynamic code evaluation is slow and
// can usually be avoided. The code below must be dynamically evaluated to
// ensure we fool the JIT compiler.
recurse_until.constructor(
'return function(iter_domino, x,y){' +
// rotate the indices
'iter_domino[0]=iter_domino[0]+1&7;' +
'if(!iter_domino[0])' +
'iter_domino.push( iter_domino.shift() );' +
'return iter_domino' +
'}'
)()
);
if (iter_domino === null) return ret;
recurse_until = ret;
return iter_domino;
}
})();
JavaScript is very different from other languages in that benchmarks can easily be a performance-sin when misused. What really matters is what should in theory run the fastest accounting for everything in JavaScript. The browser you are running your benchmark in right now may fail to optimize for something that a later version of the browser will optimize for.
Further, browsers are guided in the direction that we program. If everyone used CodeA, which makes no performance sense via pure logic but is really fast (44Kops/s) only in a certain browser, other browsers will lean towards optimizing CodeA, and CodeA may eventually surpass 44Kops/s in all browsers. On the other hand, if CodeA were really slow in all browsers (9Kops/s) but very logical performance-wise, browsers would be able to take advantage of that logic, and CodeA might soon surpass 900Kops/s in all browsers. Ascertaining the logical performance of code is very simple and very difficult. One must put themselves in the shoes of the computer and imagine one has an infinite amount of paper, an infinite supply of pencils, an infinite amount of time, and no ability to interpret the purpose/intention of the code. How can you structure your code to fare the best under such hypothetical circumstances? For example, hypothetically, the hash maps incurred by foo.bar would be a bit slower than doing foo_bar because foo.bar would require looking at the table named foo and finding the property named bar. You could put your finger on the location of the bar property to cache it, but the overhead of looking through the table to find bar cost time.
You are definitely micro-optimizing. I wouldn't worry about it until there is a demonstrable performance bottleneck, and you have narrowed the issue to using multiple vars vs a object with properties.
Logically thinking about it, the object approach requires three variable creations (one for the object, and one for each property on the object) vs. two for just declaring the variables. So the object will have a higher memory footprint. However, it is probably more efficient to pass an object to a method than to pass n > 1 variables to a method, since you only need to copy one value (JavaScript is pass-by-value). This also has implications for keeping track of the lexical scoping of the objects; i.e., passing fewer things to methods will use less memory.
However, I doubt the performance differences will even be quantifiable by any profiler.
Theory, or questions like "what are you even doing, dude?", can of course appear here as answers. But I don't think that's a good approach.
I just created two test benches:
Specific: http://jsben.ch/SvNyw for the global scope
It shows, for example, that as of 07/2017, in Chromium-based browsers (Vivaldi, Opera, Google Chrome, and others), it is preferable to use var to achieve maximum performance: it works about 25% faster for reading values and 10% faster for writing them.
Under Node.js the results are about the same, because it is the same JS engine.
In Opera Presto (12.18) the percentage results are similar to those in Chromium-based browsers.
In (modern) Firefox the picture is different and strange: reading a global-scope var is about as fast as reading an object property, while writing a global-scope var is dramatically slower than writing obj.prop (around twice as slow). It seems like a bug.
You are welcome to test under IE/Edge or any other browser.
Normal case: http://jsben.ch/5UvSZ for in-function local scope
In both Chromium-based browsers and Mozilla Firefox you can see simple var performance hugely dominating object property access. Local simple variables are several times (!) faster than dealing with object properties.
So,
if you need to maximize the performance of some critical JavaScript code:
in the browser - you may be forced to make different optimizations for different browsers. I don't recommend it! Or you can pick some "favorite" browser, optimize your code for it, and ignore what freezes happen in the others. Not very good, but it is a way.
in the browser, again - do you really need to optimize this way? Maybe something is wrong in your algorithm / code logic?
in a high-load Node.js module (or other high-load computations) - well, try to minimize object "dots", with minimal damage to quality/readability of course: use var.
The safe optimization trick for any case: when you have too many operations on obj.subobj.*, you can do var subobj = obj.subobj; and operate on subobj.*. This can even improve readability.
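A sketch of that trick (hypothetical object shape):

const obj = { subobj: { a: 1, b: 2 } };

// before: two chained property lookups per iteration
let slow = 0;
for (let i = 0; i < 1000000; i++) slow += obj.subobj.a + obj.subobj.b;

// after: cache the subobject once, outside the loop
const subobj = obj.subobj;
let fast = 0;
for (let i = 0; i < 1000000; i++) fast += subobj.a + subobj.b;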
In any case, think about what you need to do, and make real benchmarks of your high-load code.

Is it faster to access a JavaScript array directly?

I was reading an article: Optimizing JavaScript for Execution Speed
And there is a section that says:
Use this code:
for (var i = 0; (p = document.getElementsByTagName("P")[i]); i++)
Instead of:
nl = document.getElementsByTagName("P");
for (var i = 0; i < nl.length; i++)
{
  p = nl[i];
}
for performance reasons.
I have always used the "wrong" way, according to the article, but am I wrong, or is the article wrong?
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
--Donald Knuth
Personally, I would use your way because it is more readable and easier to maintain. Then I would use a tool such as YSlow to profile the code and iron out the performance bottlenecks.
If you look at it from a language like C#, you'd expect the second statement to be more efficient; however, C# is not an interpreted language.
As the guide states: your browser is optimized to retrieve the right nodes from live lists and does this a lot faster than retrieving them from the "cache" you define in your variable. Also, you have to determine the length on each iteration, which might cause a bit of performance loss as well.
Interpreted languages react differently from compiled languages; they're optimized in different ways.
Good question - I would assume that not calling the function getElementsByTagName() on every loop iteration would save time. I could see this being faster: you aren't checking the length of the array, just that the value got assigned.
var p;
var ps = document.getElementsByTagName("P");
for (var i = 0; (p = ps[i]); i++) {
  //...
}
Of course this also assumes that none of the values in your array evaluate to false. A numeric array that may contain a 0 will break this loop.
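A sketch of that failure mode with a plain array:

const nums = [3, 0, 7];
let v;
const seen = [];
for (let i = 0; (v = nums[i]); i++) seen.push(v);
console.log(seen); // [3] - the 0 terminated the loop early

const seenAll = [];
for (let i = 0; i < nums.length; i++) seenAll.push(nums[i]);
console.log(seenAll); // [3, 0, 7]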
Interesting. Other references do seem to back up the idea that NodeLists are comparatively heavyweight.
The question is... what's the overhead? Is it enough to bother about? I'm not a fan of premature optimisation. However, this is an interesting case, because it's not just the cost of iteration that's affected; there's extra overhead, as the NodeList must be kept in sync with any changes to the DOM.
With no further evidence I tend to believe the article.
No, it will not be faster. Actually it is nearly total nonsense, since on every step of the for loop you call getElementsByTagName, which is a time-consuming function.
The ideal loop would be as follows:
nl = document.getElementsByTagName("P");
for (var i = nl.length - 1; i >= 0; i--)
{
  p = nl[i];
}
EDIT:
I actually tested the two examples you have given in Firebug using console.time, and as everyone thought, the first one took 1ms whereas the second one took 0ms =)
It makes sense to just assign it directly to the variable. Probably quicker to write as well. I'd say the article might have some truth to it.
Instead of having to get the length every time, check it against i, and then assign the variable, you simply check whether p was able to be set. This cleans up the code and probably is actually faster.
I typically do this (move the length test outside the for loop):
var nl = document.getElementsByTagName("p");
var nll = nl.length;
for (var i = 0; i < nll; i++) {
  p = nl[i];
}
or for compactness...
var nl = document.getElementsByTagName("p");
for (var i = 0, nll = nl.length; i < nll; i++) {
  p = nl[i];
}
which presumes that the length doesn't change during access (which in my case it doesn't),
but I'd say that running a bunch of performance tests on the article's idea would be the definitive answer.
The author of the article wrote:
In most cases, this is faster than caching the NodeList. In the second example, the browser doesn't need to create the node list object. It needs only to find the element at index i at that exact moment.
As always, it depends. Maybe it depends on the number of elements in the NodeList.
For me this approach is not safe if the number of elements can change; this could cause an index out of bounds.
From the article you link to:
In the second example, the browser doesn't need to create the node list object. It needs only to find the element at index i at that exact moment.
This is nonsense. In the first example, the node list is created and a reference to it is held in a variable. If something happens which causes the node list to change - say you remove a paragraph - then the browser has to do some work to update the node list. However, if your code doesn't cause the list to change, this isn't an issue.
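For what it's worth, the list really is live; a quick browser-console sketch:

// getElementsByTagName returns a *live* HTMLCollection that reflects DOM changes
const paragraphs = document.getElementsByTagName('p');
const before = paragraphs.length;
document.body.appendChild(document.createElement('p'));
console.log(paragraphs.length === before + 1); // true - same object, updated view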
In the second example, far from not needing to create the node list, the browser has to create the node list every time through the loop, then find the element at index i. The fact that a reference to the node list is never assigned to a variable doesn't mean the list doesn't have to be created, as the author seems to think. Object creation is expensive (no matter what the author says about browsers "being optimized for this"), so this is going to be a big performance hit for many applications.
Optimisation is always dependent on the actual real-world usage your application encounters. Articles such as this shouldn't be seen as saying "always work this way", but as collections of techniques, any one of which might, in some specific set of circumstances, be of value. The second example is, in my opinion, less easy to follow, and that alone puts it in the realm of tricksy techniques that should only be used if there is a proven benefit in a specific case.
(Frankly, I also don't trust advice offered by a programmer who uses a variable name like "nl". If he's too lazy to use meaningful names when writing a tutorial, I'm glad I don't have to maintain his production code.)
It is worthless to discuss theory when actual tests will be more accurate; comparing both, the second method was clearly faster.
Here is sample code from a benchmark test
var start1 = new Date().getTime();
for (var j = 0; j < 500000; j++) {
  for (var i = 0; (p = document.getElementsByTagName("P")[i]); i++);
}
var end1 = new Date().getTime() - start1;

var start2 = new Date().getTime();
for (var j = 0; j < 500000; j++) {
  nl = document.getElementsByTagName("P");
  for (var i = 0; i < nl.length; i++)
  {
    p = nl[i];
  }
}
var end2 = new Date().getTime() - start2;

alert("first:" + end1 + "\nSecond:" + end2);
In Chrome the first method took 2324ms while the second took 1986ms.
But note that for 500,000 iterations there was a difference of only ~300ms, so I wouldn't bother with this at all.
